Text File | 1985-10-12 | 215KB | 5,797 lines

The SORITEC Sampler (tm)
Version 1.06B-1.67

From
The Sorites Group, Inc.
PO Box 2939
Springfield, VA 22152

March 13, 1985


TABLE OF CONTENTS

Chapter 1   Introduction
    1.0     Introduction
    1.1     What is SORITEC?
    1.2     SORITEC Sampler
    1.3     Getting Started
    1.4     Invoking SORITEC Sampler
    1.4.1   Interactive Processing
    1.4.2   Batch Processing
    1.5     Executing SAC Files
    1.6     SORITEC Input Journal Files

Chapter 2   SORITEC Syntax
    2.0     Introduction
    2.1     Variable Names
    2.2     Special Symbols
    2.3     Variable Types
    2.4     Selection of the Observation Set
    2.4.1   Conditional Selection of the Observation Period
    2.5     Transformations
    2.6     Revising Data in SORITEC
    2.7     Missing Data Handling
    2.7.1   Missing Value Symbol Declaration
    2.7.2   Missing Value Logical Function
    2.7.3   Imputation of Missing Values
    2.8     Wildcards
    2.9     Options
    2.10    Recovering Internal SORITEC Variables
    2.11    SORITEC's Symbol Table
    2.12    Minor Control Statements
    2.12.1  Specify Width of Output Device
    2.12.2  Change Length of Input Line
    2.12.3  Reset Maximum Error Limit
    2.12.4  Turn Batch Listing On or Off
    2.12.5  Label Batch Output Pages

Chapter 3   Data Entry and Output
    3.0     Introduction
    3.1     SORITEC Alternate Load (SAL) Files
    3.1.1   SAL File Input
    3.1.2   SAL File Output
    3.2     Data Interchange Format (DIF) Files
    3.2.1   DIF File Input
    3.2.2   DIF File Output
    3.3     Formatted Input and Output
    3.3.1   FORTRAN Formatted Input
    3.3.2   FORTRAN Formatted Output
    3.4     Keyboard Entry
    3.5     Output of Data to the Terminal
    3.5.1   Tabular Display
    3.5.2   Graphical Display
    3.6     SORITEC DataBank Files

Chapter 4   SORITEC Databank (SDB) Files
    4.0     Introduction
    4.1     Create a Databank
    4.2     Access a Databank
    4.3     Release a Databank from SORITEC
    4.4     Purge a Databank
    4.5     Retrieve Items from a Databank
    4.6     Store Items in a Databank
    4.7     Replace Items in a Databank
    4.8     Rename Items in a Databank
    4.9     Switch the Names of Two Items in a Databank
    4.10    Discard Items from a Databank
    4.11    Generate a Directory Listing of a Databank

Chapter 5   Programming Constructs
    5.0     Introduction
    5.1     Numeric Looping
    5.2     Unconditional Branching
    5.3     Conditional Branching
    5.4     Null (Continuation) Statement
    5.5     Alpha Looping

Chapter 6   Dummy Data Series Generation and Special Transformation Commands
    6.0     Introduction
    6.1     Create a Time Trend Dummy Series
    6.2     Create Seasonal Dummies
    6.3     Recode a Variable
    6.4     Conversion of Time-Series from One Periodicity to Another
    6.5     Maximum Function
    6.6     Minimum Function
    6.7     Modular Division
    6.8     Compute Moving Average
    6.9     Compute Moving Sum
    6.10    Statistical Operations
    6.10.1  Correlation Matrix Calculation
    6.10.2  Covariance Matrix Calculation
    6.10.3  Other Statistical Operations

Chapter 7   SORITEC Financial Functions
    7.0     Financial Functions in SORITEC
    7.1     Internal Rate of Return
    7.2     Present Value
    7.3     Loan Amortization

Chapter 8   SORITEC Sampler Cross-Section Techniques
    8.0     Introduction
    8.1     Synopsis
    8.2     Crosstabulation Analysis

Chapter 9   Estimation and Forecasting with SORITEC Sampler
    9.0     Introduction
    9.1     Ordinary Least Squares (OLS) Estimation
    9.2     Autocorrelation Techniques for the Single Equation Model
    9.2.1   Cochrane-Orcutt Iterative Technique
    9.2.2   Hildreth-Lu Scanning Technique
    9.3     Two-Stage Least Squares (2SLS) Estimation
    9.4     Forecasting Single Equation Models

Chapter 10  SORITEC Interactive Print Server
    10.0    Introduction
    10.1    Entering Tableau Mode
    10.2    Tableau Descriptions
    10.2.1  Coefficient Display
    10.2.2  Regression Summary Table
    10.2.3  Residual Autocorrelation Summary
    10.2.4  PDF and Histogram of Standardized Residuals
    10.2.5  Non-Parametric Residual Distribution Tests
    10.2.6  Regression ANOVA Table
    10.2.7  Covariance Matrix of Coefficient Estimates
    10.2.8  Correlation Matrix of Coefficient Estimates
    10.2.9  Beta Coefficients, Elasticities and Partial R
    10.2.10 Statistical Summary of Exogenous Variables
    10.2.11 Actual vs Fitted Plot and Standardized Residuals
    10.3    Interactive Crosstabs

APPENDIX I    SORITEC INTERNAL SYSTEM NAMES
APPENDIX II   GLOBAL OPTIONS AND DEFAULT SETTINGS IN SORITEC
APPENDIX III  QUICK REFERENCE LISTING OF SORITEC Sampler COMMANDS
APPENDIX IV   DETAILED FEATURE LIST FOR SORITEC VERSION 1.06B

INDEX

SORITEC INFORMATION REQUEST FORM


Chapter 1

Introduction

1.0 Introduction

This econometrics package, called SORITEC Sampler, is provided to you free of charge from the Sorites Group, Inc. (SGI) of Springfield, Virginia. SGI is a software engineering firm which has been developing and supporting a machine-independent econometric modeling package since 1978. Our package, called SORITEC (SORITes EConometric) is now supported on 22 different mainframes, minicomputers and microcomputers and still has only one reference manual. The program's command syntax is identical on all machines.

In the spring of 1984, we made our first port to the IBM PC. Unlike other econometric packages for microcomputers, the full version of SORITEC for the PC is not a subset of the mainframe package. In order for the program to operate, the full power of the IBM PC/XT, PC/AT or compatible computer must be available. This means that your system must have a hard disk and 512K of RAM. The 8087 math co-processor is required for the full version of SORITEC.

The availability of advanced econometric and statistical techniques including full information maximum likelihood (FIML), and non-linear simultaneous equations estimation and simulation for a fraction of the price of similar capabilities on a mainframe has put us in the forefront of using the full potential of the IBM PC and compatibles. Given the increasing power and declining costs of micro-computers, our original belief in the need for a machine-independent econometrics package has proved correct.

1.1 What is SORITEC?

SORITEC is a sophisticated econometric modeling and forecasting system that allows you to estimate or solve and simulate almost any mathematical model that you might specify. The program enables you to do econometric time-series analysis within an easy-to-use command language syntax. Those of you familiar with TSP will find SORITEC's command language similar.

SORITEC can handle models with hundreds of equations, either linear or non-linear, in either a static or dynamic framework. Model systems can be specified, built, rearranged, databanked and manipulated by name. Once a model is constructed, it can be recalled and resimulated by a single command.

SORITEC provides a report writer capable of providing detailed and complex reports with minimal effort and training. SORITEC is also a complete data processing language that lets you do varied and complex data reduction operations easily. The combination of its econometric methods and report writing capabilities permits SORITEC to handle most current production reporting automatically via command files. SORITEC also contains most of the useful statistical functions of the leading mainframe statistics packages. As new versions of SORITEC are released, we expect it will soon exceed the pure statistical capabilities of these packages. Appendix IV of this document provides a complete list of features in SORITEC.

A new version of SORITEC (Version 1.06B), available in February 1985, incorporates significant enhancements to the system's analytical capabilities and user friendliness, including multivariate techniques such as PROBIT, CROSSTABS and ANOVA, the most complete set of regression diagnostics available, and tableau-oriented regression output.

Mainframe and minicomputer versions of SORITEC are available to universities for teaching purposes free of charge (except for a small processing fee). Contact SGI or a local distributor for prices.

1.2 SORITEC Sampler

SORITEC Sampler is a subset of the full SORITEC and is equivalent to econometric packages sold today for $200 to $400. It is supplied free of charge, and may be reproduced and distributed freely as long as no fee is charged and no alterations are made. The program requires 384K of random access memory and can run off floppy diskettes. The 8087 math coprocessor is recommended.

We decided to give SORITEC Sampler away for several reasons. First, distributing a version of SORITEC which is useful, free and reproducible is a cost-effective method of advertising this type of product. You are encouraged to make as many copies as you wish and pass them on to friends and colleagues. Second, we needed a demo copy to illustrate SORITEC's command structure, data handling capabilities and techniques. Rather than sending out a demo disk that simply went through some song and dance without allowing you to really "touch" the package, we figured that a "live", though limited, version of the real thing would be an excellent demonstration of SORITEC's features. Lastly, SORITEC Sampler is based on the belief that there is no justification for charging to estimate single equation models.

The techniques described in this Reference Manual are those supported by SORITEC Sampler. However, they function identically to those in SORITEC. SORITEC Sampler provides a useful introductory econometrics package that will encourage people to apply econometric and statistical techniques. In return, we hope that you will consider us when you want more econometric capability on your computer and will help spread the word about SORITEC by passing this package around. We, in turn, will continue to reinvest our revenues in product development instead of elaborate advertising.

You can obtain the latest release of SORITEC Sampler plus a bound copy of this Reference Manual and the full SORITEC Reference Manual by sending (U.S.)$50.00 to SGI.
Consult the form at the end of this document for further details.

Note that SORITEC Sampler is NOT a supported product and is distributed without warranties of MERCHANTABILITY and FITNESS FOR A PARTICULAR PURPOSE.

1.3 Getting Started

SORITEC Sampler is distributed on two diskettes; a third diskette contains this documentation and examples. You can make a backup of the diskettes for safekeeping or distribution by using the DOS COPY command. Use

    COPY a:*.* b:

to copy all files on the diskette to another diskette. If your system has a hard disk, use

    COPY a:*.*

to copy all files on the diskette to your current directory.

SORITEC Sampler requires at least 384 KB of RAM and DOS 2.0 to execute. An 8087 or 80287 math co-processor is optional, but recommended. The program may be run either from floppy diskettes or from a directory on a hard disk system.

Sampler has some minimum system requirements which may require you to change your CONFIG.SYS file. The following commands must be included in the CONFIG.SYS file before running the program.

    DEVICE = ANSI.SYS
    FILES = 12
    BUFFERS = 12
    BREAK = ON

To invoke SORITEC Sampler on systems without a hard disk, insert Disk 1 of 2 into the current drive and enter:

    SAMPLER

followed by a carriage return. After a moment, you will be prompted to insert Disk 2 of 2 into the current drive. Replace the first diskette with the second and enter a carriage return. Once the SORITEC Sampler banner appears on the screen, follow the instructions displayed there.

To invoke SORITEC Sampler on systems with a hard disk, you should first copy the SAMPLER.EXE from the first distribution disk, plus all .OVL files and the SAMPLER.FMT file from the second distribution diskette to a directory or subdirectory on the disk. Invoke the program, as outlined above, by entering SAMPLER and follow the instructions once the banner appears on the screen.

Use the DOS PATH command to identify the subdirectory in which your SAMPLER.EXE, .OVL and .FMT files are stored if you want to invoke SORITEC Sampler from any other directory or subdirectory on your hard disk. Sampler will refer to its "home" subdirectory to load overlays, etc., but will look for all input files and write all output files, including the input journal file, to the current directory unless another directory is explicitly specified in the SORITEC command.

SORITEC Sampler also supports DOS redirection of standard input and output devices so that filename arguments may appear on the command line. Any combination of:

    SAMPLER [ < [d:][path]filename ] [ > [d:][path]filename ] [ >> [d:][path]filename ]

are legal arguments in the command line. Refer to the "Advanced DOS Commands" chapter of your DOS manual for information about I/O redirection. DOS redirection is particularly useful with SORITEC's batch processing facility. Do not invoke DOS redirection if you are running SORITEC Sampler on a floppy disk system.

1.4 Invoking SORITEC Sampler

SORITEC Sampler executes in both interactive and batch modes of processing. However, before describing how each mode is invoked, it is important to distinguish SORITEC interactive and batch processing modes from the foreground and background processing modes that are typically associated with these terms. When SORITEC is in interactive mode, the program takes each line of input and processes it as it is received. In batch processing mode, on the other hand, SORITEC accepts input lines until they are logically concluded with an END statement. At that point, batch job execution begins. Note that SORITEC interactive and batch modes can run in both foreground and background processing environments.

Batch job processing in SORITEC has certain characteristics that sometimes make it more convenient to use than interactive mode.
First, it compiles a complete listing of the commands of a job and outputs it without line prompts to the output device before execution begins. This separates the command lines from the output and generally makes the output more presentable for reports, etc. Second, batch processing mode provides for the labelling of the job and the insertion of titles into the output listing.

Batch processing mode is often useful when output is too wide to be displayed legibly on the terminal. Through DOS redirection and respecification of the output width, output that would otherwise be difficult to read on a terminal can be routed to other output devices, such as line printers. Although most of these features can be replicated in interactive mode, they are generally more convenient to use in a batch job.

1.4.1 Interactive Processing

SORITEC Sampler prompts you for input after the banner page has been passed. Prompts in SORITEC are of the form 1--, 2--, 3-- and so on. When the first prompt is returned, interactive processing is started by entering:

    HELLO

Sampler will respond by printing another banner with version information, date and time, default settings for input (SCAN) and output (WIDTH), and workspace size. After the second prompt has been displayed, you may enter any legal SORITEC Sampler command.

Interactive processing is terminated by entering the command:

    QUIT

Execution of QUIT closes and returns any files that are currently attached and returns control to DOS. All items in the user's workspace are irretrievably lost once the QUIT command is executed.

1.4.2 Batch Processing

SORITEC identifies batch job processing through the JOB command. The JOB command consists of the command name and up to 120 characters of label information, i.e.,

    JOB job_label

The JOB command supplies unchangeable labelling information for the entire batch run. As such, only one JOB command may appear in any single job deck. The "job_label" may not contain the symbols ; , $ , or &.

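The labelling rules above can be sketched as a small pre-flight check written outside SORITEC. This is only an illustration in Python, not part of the package: the function names are hypothetical, and the sketch assumes that the commas in the manual's list of forbidden symbols are separators, so that only ";", "$" and "&" are rejected.

```python
MAX_LABEL_LENGTH = 120               # "up to 120 characters of label information"
FORBIDDEN_SYMBOLS = {";", "$", "&"}  # assumption: the commas in the manual's list are separators


def job_label_problems(label):
    """Return a list of reasons why `label` would not be a usable JOB label."""
    problems = []
    if len(label) > MAX_LABEL_LENGTH:
        problems.append(f"label is {len(label)} characters; the limit is {MAX_LABEL_LENGTH}")
    bad = sorted(FORBIDDEN_SYMBOLS.intersection(label))
    if bad:
        problems.append("label contains forbidden symbol(s): " + " ".join(bad))
    return problems


def job_command(label):
    """Build the JOB line for a batch deck, refusing labels that break the rules."""
    problems = job_label_problems(label)
    if problems:
        raise ValueError("; ".join(problems))
    return f"JOB {label}"
```

A call such as job_command("Quarterly GNP forecast") returns the line to place at the top of a job deck, while a label containing "$" raises an error before the deck is ever written.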
Batch processing is terminated by the END command which is entered simply as:

    END

At the end of a JOB, the END statement instructs SORITEC to return and close any databank or other file which is attached. The user's workspace is irretrievably lost after the END statement is processed.

Note that the END command has several uses in SORITEC. It is required at the end of SORITEC SAL files and to close DO loops and PROCEDURES. This does not mean that you cannot embed a command within a batch job that uses an END statement. SORITEC keeps track of END statements when compiling batch job statements and senses the end of a JOB only when it is logically compelled to do so. Descriptions of these other commands that use the END command are provided later in this documentation.

1.5 Executing SAC Files

SORITEC accepts input from other than the terminal through a command file known as a SORITEC Alternate Command, or SAC, file. A SAC file is simply a DOS file that contains legal SORITEC commands. It may be structured as a batch job for SORITEC's batch processing facility or may simply be a set of commands as you would enter them from the terminal. For SORITEC to recognize it as a SAC file, the filename must have a .SAC extension, i.e., the file must exist on your DOS directory as "filename.SAC". It can be constructed using any commercially available text processor.

SORITEC will execute command files at any point in an interactive processing session. Command file processing is started by entering:

    EXECUTE filename

where "filename" is the name of the command file you wish to have executed. Do not enter the file extension with the filename on the EXECUTE command line. If the SAC file exists on a drive or directory other than the current one, it must be referenced within single quotations, i.e.,

    EXECUTE 'd:filename'    or    EXECUTE '\path\filename'

Command file output is always displayed on the terminal unless it has been redirected via DOS redirection.

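The naming rules in this section (a required .SAC extension on disk, no extension on the EXECUTE line, single quotes when a drive or path is given) can be sketched as a small helper. This is a Python illustration only; the function name execute_command is hypothetical, and the check for a drive or path is an assumption that looks simply for ":" or "\" in the name.

```python
def execute_command(sac_path):
    """Turn a DOS path to a .SAC file into the EXECUTE line described above."""
    if not sac_path.upper().endswith(".SAC"):
        raise ValueError("SORITEC only recognizes command files with a .SAC extension")
    stem = sac_path[:-4]  # the extension is not entered on the EXECUTE line
    # Assumption: a name counts as 'not in the current directory' when it
    # carries a drive letter (":") or a directory separator ("\").
    if ":" in stem or "\\" in stem:
        return f"EXECUTE '{stem}'"
    return f"EXECUTE {stem}"
```

For a file in the current directory the helper produces a bare name; for a file on another drive or directory it adds the single quotes that the manual requires.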
1.6 SORITEC Input Journal Files

Sampler will open an input journal file on the current directory, called SORITEC.JNL, if interactive processing mode is invoked and the ON JOURNAL option is enabled. This file stores all commands that are entered during a session so that you can archive the command sequence for future use. The file can later be executed as a SORITEC Alternate Command file. Journal files are particularly important for reviewing an interactive session for errors when results are not as expected. They can also be edited and re-executed to produce a "final draft" of a particular statistical or estimation problem.

Any file that exists as SORITEC.JNL on the current directory is automatically erased when a new journal file is written. Be sure to rename any journal files you wish to keep. Remember that you must change the filename extension to ".SAC" if you wish to EXECUTE it as a command file.

Do not enable the JOURNAL option if you are running SORITEC Sampler off floppy disks as there is no room on Disk 2 for writing the file. If you attempt to do this, Sampler will print an error message and re-prompt you for a command. Unpredictable results could be obtained, however, in subsequent operations.


Chapter 2

SORITEC Syntax

2.0 Introduction

SORITEC syntax has been constructed to make the entire package easy to learn and use. Typical SORITEC operations can be divided into two types of statements: commands and transformations. Before considering the command structure or allowable transforms, we need to consider the form of a SORITEC variable name.

The most important fact to keep in mind when you are using SORITEC Sampler is that the language is "series" oriented rather than value oriented. In FORTRAN or BASIC, the statement X=Y sets the value of a variable X to the value of variable Y. In SORITEC, X = Y replaces the entire time-series X with the time-series Y. So, in SORITEC, single parameter values are more the exception than the rule.

2.1 Variable Names

SORITEC variable names are composed of the characters A-Z, the numbers 0-9 and the symbols @, %, ^, _, or :. The name MUST begin with a character (A-Z) and must be no more than 32 characters (or symbols) long. Mathematical operators may not be used in variable names.

SORITEC allows dynamic leading or lagging of variables through subscripted arguments; for example, GNP(1) and GNP(-1) are the first lead and lag values of GNP. Arguments for lags or ranges may also be integer constants or SORITEC variables. In commands that expect multiple arguments, SORITEC will accept ranged values of leads and lags, e.g., GNP(+2 TO -3) is automatically expanded to GNP(2) GNP(1) GNP GNP(-1) GNP(-2) GNP(-3). Note that positive signing of lead arguments is optional.

2.2 Special Symbols

SORITEC defines several special symbols to provide a simplifying shorthand in using the package. Currently, the symbols ;, !, ",", =, +, -, *, /, ., >, <, (, ), ?, &, and the string ... have special meaning in SORITEC.

;    delimits each command when several commands are "stacked" on a single line, e.g.,

         USE 1984M1 1984M6 ; PRINT GNP

!    identifies comments in SORITEC. If entered in column 1, any text between the ! and the end of the line or line delimiter ";" is considered to be a comment and is ignored by the interpreter. The ! symbol only functions as a comment identifier if placed in column 1.

","  is used only as an argument separator and is interpreted as a blank everywhere except in a format statement.

+ - * / . < > and = are reserved for math operations.

The parenthetical symbols, ( and ), are reserved for designating command modifiers, arguments and subscripts.

The symbols "*" and "?" are used as wildcard references, described later in this chapter.

Lastly, & and ... indicate that the current command continues onto the next line.

2.3 Variable Types

There are seven types of variables in the SORITEC language.
Time-series variables are the default data type in SORITEC, so a reference to variable X implicitly references the time-series X. Variable assignment implicitly assumes the variable is time-series unless you state otherwise; so, a simple statement such as x=2 creates a series of numbers all equal to 2, NOT a single value.

The second most common data types are parameters and constants. Both are scalar values, but whereas parameter values can be changed by the SET command, constants cannot. Parameters and constants are created by the following statements.

    PARAMETER param_1 [value_1] param_2 [value_2] ...
    CONSTANT const_1 [value_1] const_2 [value_2] ...

If the value associated with a parameter or constant is omitted, SORITEC sets it to zero. Parameters can be set or reset using any standard transformation by prefixing the transformation with the SET command, for example:

    PARAMETER a .5 b .3
    SET a = a**0.5 * log(b)

SORITEC also defines vector and matrix data types. These types are created by using the VECTOR or MMAKE commands, respectively. To create a vector, use the command:

    VECTOR vector_name value_1 value_2 ...

For example, to create the vector BETA, you would type

    VECTOR BETA .5 .2 .1 -.5

SORITEC keeps track of the length of the vector when it is created. Individual elements of a vector can be manipulated like scalar values in SORITEC commands using subscript notation. For example:

    SET ZERO=BETA(1)+BETA(4)

would set the value of the scalar ZERO to 0.0. Matrix data types are not supported in SORITEC Sampler.

SORITEC also allows you to name and manipulate equations as a separate data construct using the EQUATION command. The form of the command is:

    EQUATION equation_name [equation]

Equations are structured exactly as they are in FORTRAN. Equations can be stored in databases and can be computed by name, once values have been assigned to their parameters and variables.
Use the COMPUTE command, which is of the form:

    COMPUTE equation_name

to recompute values for the left-hand side variables.

Note that the primary use of equations in SORITEC is for forecasting and for non-linear estimation. In SORITEC Sampler, you can only use equations for forecasting or recomputing values, but not for estimation.

The final data type is the GROUP. A GROUP is a namelist that specifies a set of names for further processing. The namelist is initialized by the GROUP command, which has the form:

    GROUP group_name name_1 name_2 ... name_k

To extract the elements of a namelist, group expansion must first be enabled via the ON GROUP command. The group name is then replaced by the individual names in the namelist. This avoids the need to type the same set of names repeatedly. For example, the following commands greatly simplify testing the inclusion of variables in a regression equation.

    GROUP basic_variables GNP M1 TAXES GOV_EXP PRIME
    ON GROUP
    REGRESS DEFICIT basic_variables PARTY
    REGRESS DEFICIT basic_variables TIME
    ... etc

You can also reference individual elements within a GROUP by index number. For example, you could reference "basic_variables(2)" in place of M1 in the example given above. Referencing individual namelist elements by index number is particularly useful in DO loops.

2.4 Selection of the Observation Set

Periodicity and length of data series are defined by the USE command in SORITEC. The USE period defined by this command is active in all subsequent SORITEC commands until explicitly changed by another USE. Data need not be continuous over the range of observations, but instead may consist of a series of intervals. The form of the USE command is:

    USE [begin_1] [end_1] [begin_2] [end_2] ...

USE requires zero, one or an even number of arguments which may be positive integers, constants, parameters or a vector. Each pair of arguments defines a range of observations within the overall observation range, "begin_1" to "end_n".
The second argument must not be less than the first. If no arguments are included in the command line, SORITEC returns the currently active USE period. If only one argument is included in the command line, the end period is implicitly equated to the first.

SORITEC allows you to define annual, semi-annual, quarterly, monthly, ten day, weekly, daily and undated data types. Periodicity of time-series data is defined by appending an appropriate suffix to the data year, as shown in the following table.

    PERIODICITY    SUFFIX    RANGE(x)
    -----------    ------    --------
    Annual         none      --
    Semi-annual    Sx        [1,2]
    Quarterly      Qx        [1,4]
    Monthly        Mx        [1,12]
    Ten Day        Tx        [1,37]
    Weekly         Wx        [1,52]
    Daily          Dx        [1,366]
    Undated        none      [1,1000]

The permissible range of years in dated data types is 1901 to 2100. Note that Ten Day data consists of first and second ten-day periods of the month, and a remaining period of 8, 10 or 11 days. Weekly data span Sunday through Saturday. SORITEC Sampler will convert data series from one type to another, but certain restrictions apply. Data conversion is discussed in Section 6.4.

The following are examples of USE commands:

    USE 1980q1 1984q4; USE 1942m12 1955m6

Note that the command USE 1980 is equivalent to USE 1980 1980.

2.4.1 Conditional Selection of the Observation Period

SORITEC also permits conditional selection of the sample period based on a logical variable. The format of the command is

    USEIF series

where "series" is an indicator series. The USEIF command resets the USE period to select only the observations corresponding to non-zero entries of "series". For example, to run a regression on all individuals with income between $12,000 and $24,500:

    NEW_SAMPLE = INCOME > 12000 .AND. INCOME < 24500
    USEIF NEW_SAMPLE

2.5 Transformations

The COMPUTE command is the basic SORITEC transformation command.
The command line consists of the COMPUTE command name followed by one argument, which must be an EQUATION name or any legal SORITEC transformation expression, i.e.,

    COMPUTE equation_name
        or
    [COMPUTE] transformation_expression

In the latter case, the COMPUTE command name may be omitted, e.g.,

    result = var_1 + var_2

Transformations are straightforward in SORITEC as syntax considerations conform to standard algebraic notation. Legal operators in SORITEC transformations are as follows:

    ARITHMETIC OPERATORS        LOGICAL OPERATORS
    --------------------        -----------------
    +    add                    .eq.                 equal
    -    subtract               .ne. or <> or ><     not-equal
    *    multiply               .ge. or >= or =>     greater-or-equal
    /    divide                 .le. or <= or =<     less-than-or-equal
    **   exponentiation         .gt. or >            greater-than
                                .lt. or <            less-than
                                .not.                negation
                                .and.                and-function
                                .or.                 or-function

Transformations can contain any of the mathematical functions listed below.

    LOG or ALOG or LN    Natural Logarithm
    ALOG10 or L10        Logarithm Base 10
    EXP                  Exponential constant
    ABS                  Absolute Value
    SIN                  Sine
    COS                  Cosine
    TAN                  Tangent
    ASIN                 Arcsine
    ACOS                 Arccosine
    ATAN                 Arctangent
    SINH                 Hyperbolic Sine
    COSH                 Hyperbolic Cosine
    TANH                 Hyperbolic Tangent
    ASINH                Hyperbolic Arcsine
    ACOSH                Hyperbolic Arccosine
    ATANH                Hyperbolic Arctangent
    CEILING              Next Largest Integer
    FLOOR                Next Smallest Integer
    ROUND                Round to Nearest Integer
    SIGN                 Extract Sign (+1,0,-1)
    TRUNC                Truncate Fractional Part

Arguments associated with these functions must be enclosed in parentheses. Note that there is no SQRT function in SORITEC. Use the more general form var**0.5 instead.

Use of operators in SORITEC transformations must conform to the following conventions.

    (1) Two operators (+,-,.and.,.or., etc.) cannot occur in sequence unless separated by one or more open parentheses.
    (2) The number of open and closed parentheses must be equal.
    (3) The mathematical operators "*", "/" and "**" cannot occur immediately after an open parenthesis.
    (4) An operator cannot occur immediately before a closed parenthesis.

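The four conventions above can be sketched as a rough syntax check. The following Python is an illustration only, not SORITEC's own parser: the tokenizer, the function names and the literal reading of convention (1) are assumptions, and the sketch handles only the right-hand expression of a transformation (it does not tokenize the assignment "=").

```python
import re

# Hypothetical tokenizer for the right-hand side of a transformation.
TOKEN = re.compile(r"""
    \s*(
        \.(?:eq|ne|ge|le|gt|lt|not|and|or)\.    # dotted logical operators
      | \*\* | <> | >< | >= | => | <= | =<      # two-character operators
      | [-+*/<>()]                              # one-character operators, parentheses
      | \d+(?:\.\d+)? | \.\d+                   # numeric literals such as 2, 0.5, .5
      | [A-Za-z][A-Za-z0-9_@%^:]*               # names
    )""", re.VERBOSE | re.IGNORECASE)

OPERATORS = {"+", "-", "*", "/", "**", "<>", "><", ">=", "=>", "<=", "=<", "<", ">",
             ".eq.", ".ne.", ".ge.", ".le.", ".gt.", ".lt.", ".not.", ".and.", ".or."}


def tokenize(expr):
    """Split an expression into lower-cased tokens; raise on anything unrecognized."""
    expr = expr.rstrip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"cannot tokenize {expr[pos:]!r}")
        tokens.append(match.group(1).lower())
        pos = match.end()
    return tokens


def check_conventions(expr):
    """Return the conventions (1)-(4) that `expr` appears to violate."""
    tokens = tokenize(expr)
    problems = []
    if tokens.count("(") != tokens.count(")"):
        problems.append("(2) open and closed parentheses are not equal in number")
    for prev, cur in zip(tokens, tokens[1:]):
        if prev in OPERATORS and cur in OPERATORS:
            problems.append(f"(1) operators {prev} and {cur} occur in sequence")
        if prev == "(" and cur in {"*", "/", "**"}:
            problems.append(f"(3) {cur} occurs immediately after an open parenthesis")
        if prev in OPERATORS and cur == ")":
            problems.append(f"(4) {prev} occurs immediately before a closed parenthesis")
    return problems
```

Under these assumptions, an expression such as a*-b is reported under convention (1), while a*(-b) passes because an open parenthesis separates the two operators.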
Transformations are parsed according to standard programming conventions. Therefore, subformulae in parentheses are evaluated first, followed by all function evaluations, then all "**" operations, then all "*" and "/" operations, and lastly all "+" and "-" operations. Logical operators are evaluated after parentheses and mathematical operators. Within this group, mathematical comparisons (.eq., .ne. or <> or ><, .ge. or >= or =>, .le. or <= or =<, .gt. or >, .lt. or < ) are evaluated first, followed by logical negation (.not.), and lastly by .and. and .or.. When in doubt about the order of evaluation, use parentheses to avoid errors.

Note that you can combine mathematical and logical operations in a single transformation. This allows complex conditional structures to be embedded directly into equations and expressions in a highly flexible manner. For example, the expression

    y=log(x)*(b.gt.1)+x*(b.le.1)

is a legal SORITEC transformation. The logical portion of the expression is merely evaluated to 1 or 0 and then used in the computation.

SORITEC Sampler does not handle some illegal transformations gracefully and, in these situations, can terminate sessions abruptly by exiting to DOS. For example, the transformation:

    A = (=)

"crashes" the system and returns to the DOS command level. As all active items in SORITEC's workspace are irretrievably lost in this situation, you should avoid entering nonsense into SORITEC commands. Most common errors in transformations, such as unbalanced parentheses, however, cause warning statements to be issued but keep the current SORITEC session active.

2.6 Revising Data in SORITEC

Data series may be extended or revised easily in SORITEC using the REVISE command and the USE command. The format of the command is similar to the COMPUTE command, i.e.,

    REVISE transformation_expression

A data item being REVISEd must have been previously defined in SORITEC. The command cannot be used to initialize the variable.

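As a rough mental model of REVISE, to be read alongside the description and worked examples in the rest of this section, the Python sketch below treats a series as a mapping from observation number to value and overwrites only the observations inside the active USE period. The function name, the dictionary representation and the handling of the right-hand side are assumptions made for illustration; they are not how SORITEC stores data.

```python
def revise(series, use_period, value):
    """Model of REVISE: overwrite only the observations in the active USE period.

    series     -- dict mapping observation number to value (must already exist)
    use_period -- list of (begin, end) pairs, as given to the USE command
    value      -- a scalar, or a dict of replacement values keyed by observation
    """
    if not series:
        # The manual states that REVISE cannot be used to initialize a variable.
        raise ValueError("REVISE cannot initialize a variable; define it first")
    revised = dict(series)  # values outside the USE period are retained
    for begin, end in use_period:
        for obs in range(begin, end + 1):
            new = value[obs] if isinstance(value, dict) else value
            revised[obs] = float(new)
    return revised


old_data = {obs: float(obs) for obs in range(1, 6)}  # like FILL old_data 1 2 3 4 5
print(revise(old_data, [(3, 3)], 3.5))               # like USE 3 3 then REVISE old_data=3.5
```

Extending a series works the same way in this model: a USE period of (6, 6) simply adds observation 6 to the mapping.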
REVISE updates a variable by temporarily deactivating values for the variable that lie outside the range of the currently active USE period. In other words, to update a data series you must first define the observations of the series that you wish to revise with the USE command before changing the data with the REVISE command. For example, revision of the third observation of an undated series "old_data" requires the following commands:

     FILL old_data 1 2 3 4 5
     USE 3 3
     REVISE old_data=3.5
     PRINT old_data

which generate the series:

     OLD_DATA
     ................
             .
        1    .    1.00000
        2    .    2.00000
        3    .    3.50000
        4    .    4.00000
        5    .    5.00000

Since any legal transformation is permitted as an argument, the right hand side of the equation can be a constant, time-series or other valid SORITEC expression. Revision of the third and fourth observations of the original "old_data", for example, requires the following commands:

     USE 3 4
     FILL new_data 4 5
     REVISE old_data = new_data - 1.5
     USE 1 5
     PRINT old_data

which produce the output:

     OLD_DATA
     ................
             .
        1    .    1.00000
        2    .    2.00000
        3    .    2.50000
        4    .    3.50000
        5    .    5.00000

Extending a data series by one or more observations simply requires redefining the USE period to the period you wish to update and revising the data as before. For example, the commands:

     USE 6 6
     REVISE old_data = 6
     USE 1 6
     PRINT old_data

produce the output:

     OLD_DATA
     ................
             .
        1    .    1.00000
        2    .    2.00000
        3    .    3.00000
        4    .    4.00000
        5    .    5.00000
        6    .    6.00000

A similar procedure is used when splicing two series together. For example, the following command sequence splices observations 6 through 10 of "new_data" to the original five observations of "old_data":

     USE 6 10
     FILL new_data 6 7 8 9 10
     REVISE old_data = new_data
     USE 1 10
     PRINT old_data

and produces the output:

     OLD_DATA
     ................
             .
        1    .    1.00000
        2    .    2.00000
        3    .    3.00000
        4    .    4.00000
        5    .    5.00000
        6    .    6.00000
        7    .    7.00000
        8    .    8.00000
        9    .    9.00000
       10    .    10.0000

Data revision can also be automatically implemented through the COMPUTE and FILL commands by enabling the ON REVISE global option. (The FILL command is described in Chapter 3.) Values for data in the currently active USE period are overwritten when these commands are executed, but values outside the USE period are retained, until an OFF REVISE command is encountered.

2.7 Missing Data Handling

In general, SORITEC does not do casewise or any other type of deletion when it encounters MISSING data. Instead, an error message is printed and zero is used in all contexts except transformations. An exception to this rule occurs in the cross-sectional procedures. Here, categorical techniques treat missing data as a separate category while SYNOPSIS, non-parametric and other statistical techniques ignore missing values.

Several enhancements to missing value handling have been added to SORITEC.

     (1) SORITEC generates a MISSING value in transformations that involve MISSING data, except when MISSING data are multiplied by zero. Here, a zero value for the transformation results.

     (2) The PUNCH command now generates the word 'MISSING' for each missing value.

     (3) The READ command now recognizes the words 'MISSING' and 'NA' in input data.

     (4) A MISSING command has been added that allows you to assign a missing value to a SORITEC constant.

     (5) A LEGAL function has been added that scans a data item for missing values.

The operation of the MISSING command and LEGAL function are described below.

2.7.1 Missing Value Symbol Declaration

SORITEC constants can be assigned missing values with the MISSING command. The syntax of the command is:

     MISSING constant_name

The argument "constant_name" is defined to be a SORITEC constant with the value MISSING assigned. Only one argument is permitted in the command line. Regardless of its prior type, the argument is always redefined as a SORITEC constant. MISSING cannot assign a missing value to any other variable type.
You can, however, assign missing values to other variable types using the REVISE command, as the following example shows. The commands:

     USE 1 3
     FILL SERIES 1 2 3
     USE 3
     MISSING X
     REVISE SERIES=X
     USE 1 3
     PRINT SERIES

yield:

     SERIES
     .............
        1    .    1.00000
        2    .    2.00000
        3    .    MISSING

2.7.2 Missing Value Logical Function

The LEGAL function returns the value 1 if a data item is not MISSING and zero otherwise. This enables easy conversion of MISSING values to another value.

2.7.3 Imputation of Missing Values

SORITEC Sampler provides four options for replacing missing values. Missing values may be substituted by zero, the series mean, the interpolated value or the trend forecast. The option is set globally by the IMPUTE command, i.e.,

     IMPUTE [ZERO|MEAN|INTER|TREND|NONE]

Normal missing value processing is resumed when the NONE option is executed. Entering the command IMPUTE with no arguments returns the option currently in effect. The details of each option are as follows:

     ZERO    substitutes 0 for each missing observation

     MEAN    replaces each missing observation with the mean of the series during the current use period

     INTER   interpolates the range between the last two known non-MISSING values over the missing observations

     TREND   fills in missing values with the simple trend forecast for the series over the current use period

     NONE    stops implicit imputation of missing values

2.8 Wildcards

SORITEC now supports the '*' and '?' symbols as wildcard characters in arguments. The wildcarding scheme is a simple way to reduce the time spent typing and viewing output (e.g. from the SYMBOLS command, described later in this chapter). Currently, wildcards are available for use with the FORGET, GROUP, and SYMBOLS commands.

The rules for wildcard construction are simple. An asterisk represents zero or more alphanumeric characters and a question mark substitutes for any single character.
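These two rules correspond closely to shell-style pattern matching, so they can be tried out with Python's standard fnmatch module. The sketch below is an analogy, not SORITEC's own matcher: the helper name expand is invented here, the comparison is made case-insensitive on the assumption that SORITEC names ignore case, and fnmatch lets '*' match any characters (and gives '[' a special meaning), which is slightly broader than the rule above. The workspace names are the ones used in the manual's own example that follows.

```python
from fnmatch import fnmatchcase

# Item names assumed to be present in the local workspace.
WORKSPACE = ["X", "XY", "XXY", "BBYB", "BB", "ABXYZ", "ABXY"]


def expand(pattern, names=WORKSPACE):
    """Return the workspace names that match a '*' / '?' pattern."""
    pattern = pattern.upper()  # assumption: names are case-insensitive
    return [name for name in names if fnmatchcase(name.upper(), pattern)]


for pattern in ["*", "?", "B*", "*B*", "?B??"]:
    print(f"{pattern:6}{', '.join(expand(pattern))}")
```

A command such as FORGET ab* would then act on whatever expand("ab*") returns, here ABXYZ and ABXY.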
Commands which permit wildcards match all the names in the local workspace against the wildcard pattern and expand the command line appropriately.

The following examples explain wildcard processing in SORITEC. Assume that the local workspace contains the variables X, XY, XXY, BBYB, BB, ABXYZ, and ABXY. Then:

     THESE WILDCARDS:     WOULD REFERENCE THESE ITEMS:

     *                    X, XY, XXY, BBYB, BB, ABXYZ, ABXY
     ?                    X
     B*                   BBYB, BB
     *B*                  BBYB, BB, ABXYZ, ABXY
     ?B??                 BBYB, ABXY

2.9 Options

Several global options are available to control the amount of printing, depth of analysis, etc. These options are enabled and disabled by the ON and OFF commands. For example, the command

     ON PLOT

will cause residual plots to be produced when an equation is estimated. A complete list of available options with current settings will be displayed by SORITEC if an ON command is entered with no arguments. Global options in SORITEC Sampler with their default settings are described in Appendix II.

After every ON or OFF command which changes an option, an internal result called ^FLAGS is stored as a vector. ^FLAGS contains information on the global options which are in effect immediately after the ON or OFF command is executed. It can be RECOVERed, retained in SORITEC's workspace or stored in SORITEC databanks, and can later be used to restore global options to settings that were in effect when they were recovered.

Global options are restored with the FLAGS command, which has the format:

     FLAGS flag_vector

The argument "flag_vector" is the name of the vector to which the RECOVERed SORITEC internal variable ^FLAGS has been written.

Note that flag vectors must not be changed in any way, or unpredictable results may occur. The FLAGS command exists solely to restore previous global option settings. Furthermore, the ordering and number of the global options is subject to change in future releases so flag vectors stored on SORITEC databanks may not restore the options desired if retrieved by a later release of SORITEC.
2.10 Recovering Internal SORITEC Variables

The RECOVER command allows the user to access and manipulate secondary results which have been generated and stored under internal names by SORITEC commands. Either one or two arguments are associated with the command, which has the syntax:

     RECOVER [name] internal_name

The "internal_name" is an internal system name which identifies which secondary result to RECOVER from SORITEC for later use. Legal system names of secondary results that can be recovered are given in Appendix I. The first argument, "name", is optional and is a user-defined name assigned to the recovered item. If omitted, the recovered name is identical to the internal system name.

In addition to the RECOVER command, SORITEC allows you to directly reference internal system names by prefixing a caret (^) to the variable name. For example, the commands:

     RECOVER fitted_values yfit

and

     fitted_values = ^yfit

would both recover the fitted values of the dependent variable and copy them into the variable named "fitted_values". SORITEC internal system names can be referenced directly in most situations. For example, parameters and time-series variables that are internal system names can be reassigned with the SET command and can be referenced in transformation operations. Equations, matrices, vectors and GROUPS can also be referenced. However, reassignment still requires the use of the RECOVER command. Internal names cannot be saved to a databank without being reassigned to another variable.

SORITEC will not confuse its own internal system names with variables or other identically-named data items that the user has defined in his/her program. The type of the first argument (variable, vector, constant, or other SORITEC form of data organization) is automatically defined or redefined to the type required by the second argument.

Secondary results need not be recovered immediately.
All such results remain available until a command is executed which stores other results under the same internal system name. In that event, the prior results held under that internal system name are lost.

Note that some intermediate results are retained under internal system names only if the user sets appropriate flags with the ON command. Check the default switch settings associated with each command to ensure that intermediate results are automatically saved. The internal variables and global options that enable them are:

     Internal                                        Flag Setting
     Name       Description                          to Save Value

     ^CCOR      Coefficient Correlation Matrix       OFF NOMATS
     ^VCOV      Coefficient Covariance Matrix        OFF NOMATS
     ^XTABLE    Crosstabulation Table                OFF NOMATS
     ^RAWEQ     Raw Forecasting Equation             ON RAWEQ

See Appendix II for the default values for these options.

2.11 SORITEC's Symbol Table

Any time during an interactive or batch session you can determine what item names are currently active in SORITEC's workspace by examining the symbol table. SORITEC's symbol table is listed on the output device when the command:

     SYMBOLS [ALL]

is entered. The symbol table lists each item's name, storage address, item type and length. Including the optional keyword "ALL" in the command line prints all currently active SORITEC internal names in addition to user-defined items. SYMBOLS accepts wildcards so that a selective search of the symbol table can be made.

Items can be removed from SORITEC's symbol table by invoking the FORGET command, which is of the form:

     FORGET [item_1] [item_2] ... [item_n]

Each "item_i" is a currently active item in SORITEC's workspace, as identified from the symbol table. The command may have up to 100 arguments. FORGET accepts wildcards so that selected items from the symbol table can be removed. For example,

     FORGET ab*

removes all items that begin with the characters "ab" from the symbol table.
All items from the symbol table are removed by entering the wildcard symbol "*" in place of item names, i.e.,

     FORGET *

FORGET does not affect the contents of attached SORITEC databanks nor does it return databanks.

Note that FORGET erases item names from the SORITEC symbol table but does NOT remove the data from the workspace. If you exceed the workspace limitation, FORGETting items from the symbol table will not free the stack space they occupied. You must QUIT the current session and re-invoke SORITEC from DOS to free the workspace.

2.12 Minor Control Statements

Several commands alter default settings other than those identified with global options (ON/OFF) or pass information to SORITEC for use in output listings.

2.12.1 Specify Width of Output Device

The width of output from SORITEC Sampler can be adjusted using the WIDTH command, i.e.,

     WIDTH number

The argument, "number", must be a numeric value between 50 and 150. Arguments outside this range will generate an error message, leaving the previous WIDTH definition intact. The default value for interactive usage is 80 characters; in batch mode, the default value is 132 characters.

2.12.2 Change Length of Input Line

The length of the input line that SORITEC Sampler can accept may be changed by the SCAN command, which has the format:

     SCAN number

The argument, "number", must be a numeric quantity between 50 and 150. Arguments outside this range will cause an error message and the existing SCAN will remain in effect. The default value for SCAN in interactive and batch modes is 80 characters.

2.12.3 Reset Maximum Error Limit

The maximum error limit can be reset in SORITEC batch jobs to alter the number of NONFATAL and SERIOUS errors a job can commit before the batch processor abandons compilation and execution. The syntax of the command is:

     MAXERR number

where "number" is a numeric quantity that defines the new error limit. The default setting for MAXERR is 25.
2.12.4 Turn Batch Listing On or Off

Listings of batch job commands are turned on or off by the ONLIST and OFFLIST commands, respectively. The default setting is OFFLIST.

2.12.5 Label Batch Output Pages

Up to 120 characters of label information can be printed to a SORITEC Sampler batch job listing using the TITLE command. The syntax of the command is:

     TITLE [label]

The output, "label", will appear on the third line of each output page, following the JOB statement. A TITLE command with no argument causes the third line of succeeding pages to be blank. Title labels may not contain the symbols ; , $ , or &.

As many TITLE commands as needed can be placed in a job. They are executed as they are encountered in the job stream and label all succeeding pages until another TITLE command is executed.

Chapter 3
Data Entry and Output

3.0 Introduction

Data may be imported or exported to or from SORITEC Sampler in several formats, including SORITEC Alternate Load (SAL) files, DIF files, FORTRAN formatted files, SORITEC Database Files (SDB), and keyboard entry. In addition, data may be displayed at the terminal either in tabular or graphical format. This section describes the available data input and output options with detailed descriptions of the syntax and examples.

The most common mistakes that users make with data entry are (a) forgetting to move the file into the current working directory, (b) forgetting to add the correct file extension to the file when it is created, or (c) using a file extension in SORITEC. In the latter case, SORITEC Sampler always appends the appropriate file extension to the file name so that you need not specify the extension in SORITEC Sampler file manipulation commands. If you specify a SAL file as READ(MYFILE.SAL), SORITEC Sampler will look for MYFILE.SAL.SAL. On the other hand, READ(MYFILE) will not execute if you have forgotten to append a .SAL extension to the name of the stored DOS file that you want to read.
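Mistake (c) is easy to see once the naming rule is written out. The short Python sketch below only models the behaviour described in the paragraph above; the function name sal_filename is invented for this illustration and is not a SORITEC facility.

```python
def sal_filename(read_argument):
    """Model of the rule above: '.SAL' is appended unconditionally."""
    return read_argument + ".SAL"


print(sal_filename("MYFILE"))       # the DOS file that READ(MYFILE) opens
print(sal_filename("MYFILE.SAL"))   # the doubled extension from READ(MYFILE.SAL)
```

Because the extension is added without checking the name first, the safe habit is to give DOS files the .SAL extension when they are created and to leave it off inside SORITEC commands.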
3.1 SORITEC Alternate Load (SAL) Files

SAL files are the easiest way to import large amounts of data into SORITEC Sampler. They are also a convenient means of exporting data, particularly if you want to move data to SORITEC on another (non-DOS) computer. SAL files are essentially free-field ASCII files with a special header. If you already have data in a tabular format, you can quickly create a SAL file by editing the table with any standard text editor or word processor.

SAL files are composed of three parts, (1) the header, (2) the data, and (3) the data terminator.

The header conveys information necessary for SORITEC Sampler to correctly read the data. Two commands are used to define the header section. The first is the USE period which tells SORITEC Sampler what time period the data spans. This is followed by the READ command which tells SORITEC Sampler what variable name to assign to the data. The final item following the data is a ';' that delimits each data section in a data file. The final line in a SAL file is an END statement that tells SORITEC Sampler to expect no more data for the READ statement being executed.

An example demonstrates the structure of a SORITEC SAL file. We wish to import the following data into SORITEC Sampler:

     YEAR       GNP     TAXES     PRIME

     1970    1423.5     455.6     10.75
     1971    1564.2     678.3      9.76
     1972    1688.9     778.4     13.45

The following file, named MACRO.SAL (SAL files must end with a .SAL file extension), is a valid SORITEC SAL file.

     USE 1970 1972
     READ GNP TAXES
     1423.5 455.6
     1564.2 678.3
     1688.9 778.4
     ;
     READ PRIME
     10.75
     9.76
     13.45
     ;
     END

SAL files can contain any number of data series. Furthermore, data sections (sections of a SAL file delimited by an END statement) can be stacked as necessary and imported using multiple reads (or exported using multiple writes) in SORITEC Sampler. More than one variable can be input with a single READ. The USE period can be changed as often as necessary to conform to the data.
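If the data already live in another program, a file with this structure can be generated rather than typed. The Python sketch below builds the text of one data section in the layout of the MACRO.SAL example; it is an illustration only, the function name sal_section and its arguments are invented here, and it assumes that one observation per line is acceptable since the format is described as free-field.

```python
def sal_section(use_start, use_end, blocks):
    """Build the text of one SAL data section.

    blocks is a list of (names, rows) pairs: names are the series in
    one READ statement and each row holds one observation of each.
    """
    lines = [f"USE {use_start} {use_end}"]
    for names, rows in blocks:
        lines.append("READ " + " ".join(names))
        for row in rows:
            lines.append(" ".join(str(value) for value in row))
        lines.append(";")          # terminates the data for this READ
    lines.append("END")            # terminates the data section
    return "\n".join(lines) + "\n"


macro = sal_section(1970, 1972, [
    (["GNP", "TAXES"], [(1423.5, 455.6), (1564.2, 678.3), (1688.9, 778.4)]),
    (["PRIME"], [(10.75,), (9.76,), (13.45,)]),
])
print(macro)
```

Writing the returned text to a DOS file named MACRO.SAL would reproduce the example above. Values for a multi-variable READ are written observation by observation, matching the GNP and TAXES pairs in the example.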
3.1.1 SAL File Input

SAL files are imported into SORITEC Sampler using the READ command which has the format:

     READ(filename)

As the USE period and all variable names are already predefined in the SAL file headers, no further information is needed. If referenced simply as above, the SAL file, "filename", must exist in the current directory with the filename "filename.sal". If the SAL file exists on a drive or directory other than the current one, it must be referenced within single quotations, i.e.

     READ('d:filename')

or

     READ('\path\filename')

A READ command imports data from a SAL file until it encounters an END statement. A later READ of the same file would then begin importing data following this delimiter until the next END statement is reached, and so on. No section of a SAL file can be re-read, since the file is sequentially organized.

3.1.2 SAL File Output

Data may be exported from SORITEC Sampler in SAL file format using the PUNCH command. The format of the PUNCH command is:

     PUNCH series_1 series_2 ...

PUNCH creates a SAL file named PUNCH1.SAL in the current directory or drive. SAL file data may not be directed to any other file name from within Sampler.

Before writing data to a SAL file the desired USE period MUST be in effect, and data series on the command line must be of the same periodicity. SORITEC Sampler appends the extension .SAL to the file when it is opened. If PUNCH1.SAL already exists in the directory, SORITEC Sampler will over-write the existing file with the new one.

Note that SAL files remain open until closed by a QUIT command. Multiple PUNCH commands to the same file will therefore append the data to the referenced SAL file. An END delimiter is appended to the file when it is closed.

3.2 Data Interchange Format (DIF) Files

The Data Interchange Format (DIF) file format has emerged as a de-facto standard for exchanging data between popular PC packages such as LOTUS 1-2-3, DBASE II, SUPERCALC, and various stand-alone graphics packages.
Because of this, SORITEC Sampler has been equipped with DIF file input and output facilities. The format of the DIF commands is subject to change in future SORITEC releases.

3.2.1 DIF File Input

SORITEC Sampler imports DIF files through the READDIF command. There are two forms of the READDIF command. If variable names are in the DIF file, then the command is simply:

     READDIF(filename)

If variable names are not in the DIF file, the command line is:

     READDIF(filename) series_1 series_2 ...

SORITEC Sampler supports subdirectory addressing within the filename reference. If the DIF file exists on a drive or directory other than the current one, it must be referenced within single quotations, i.e.

     READDIF('d:filename') [series_1 series_2 ...]

or

     READDIF('\path\filename') [series_1 series_2 ...]

READDIF does not read dates in DIF files so an appropriate USE period must be in effect before the command is executed.

At this writing, READDIF expects to find ONLY time-series data in the input DIF file. Any spreadsheet cells that do not contain legal numbers are interpreted as 'MISSING' values by SORITEC Sampler. As a consequence, SORITEC-generated DIF files that contain data other than time-series and that are later read by SORITEC Sampler will NOT generally produce useful results.

There are two ways that data can be organized in LOTUS to pass it to SORITEC Sampler: with and without labels. In either case, the data are interpreted under the currently active USE period in SORITEC Sampler. The USE interval is never derived from a DIF file's contents.

If the columns are to be labeled, the names must appear in ROW 1 and if the rows are to be labeled, the names must appear in COLUMN A.
For example, if the following worksheet is written to 'NATIONAL.DIF' using the LOTUS translate function:

              A         B         C         D
         +------------------------------------
       1 |    GNP     TAXES     PRIME
       2 | 1423.5     455.6     10.75
       3 | 1564.2     678.3      9.76
       4 | 1688.9     778.4     13.45

then NATIONAL.DIF can be read into SORITEC Sampler using the commands:

     USE 1970 1972
     READDIF(NATIONAL)

READDIF can read variable names up to 32 characters in length.

The unlabelled method is less convenient because correct variable names must be specified in the READDIF command. In the following example, READDIF assumes that the desired variables are stored in column order. If column D were not empty and the USE specified four observations, then the data would be interpreted in row order. The following table written from LOTUS to the file NATIONAL.DIF:

              A         B         C         D
         +------------------------------------
       1 | 1423.5     455.6     10.75
       2 | 1564.2     678.3      9.76
       3 | 1688.9     778.4     13.45

can be read into SORITEC Sampler with the commands:

     USE 1970 1972
     READDIF(NATIONAL) GNP TAXES PRIME

with the same results as in the labelled example.

Input data outside the current USE interval are ignored. If insufficient data exist to satisfy the current USE period, the remaining observations are set to 'MISSING'. READDIF tries to do something reasonable with any input DIF file by first considering the current USE interval, then examining the DIF file contents. One should spot-check READDIF input results to ensure that the rows and columns are interpreted as intended.

3.2.2 DIF File Output

DIF files may be exported from SORITEC Sampler using the PUNCHDIF command. This command has the format:

     PUNCHDIF[(filename)] arg_1 arg_2 arg_3 ... arg_n

where the arguments may be time-series, parameters, constants, vectors or matrices. Variable names in the argument list can be no longer than 10 characters; longer names are truncated. SORITEC Sampler creates a file called 'filename.DIF' which can be translated into a LOTUS worksheet using LOTUS' translate utility.
If the filename is omitted, SORITEC Sampler creates a file named PUNCH1.DIF. You can redirect DIF file output to a file on a drive or directory other than the current one using the same conventions as the READDIF command.

Note that the following rules apply:

     (1) Only observations active under the current USE command are written to the file.

     (2) PUNCHDIF re-orders its arguments (if required) so that all SERIES are written first, followed by CONSTANT items, and lastly, VECTOR items.

     (3) PARAMETERS are output as CONSTANTS.

     (4) MATRICES are output as VECTORS with M * N elements.

     (5) SORITEC 'MISSING' values are output as 'NA'.

Most of these considerations are demonstrated by the following example:

     USE 1984Q1 1984Q3
     FILL GNP 1423.5 1564.2 1688.9
     FILL TAXES 455.6 678.3 778.4
     FILL PRIME 10.75 9.76 13.45
     SET CONST=35.
     CONSTANT CONST2 223
     PARAMETER C3
     VECTOR VVV 1 2 3
     VECTOR V2 4 3 2 1
     USE 1984Q2 1984Q4
     PUNCHDIF(ADIFFILE) V2 VVV C3 CONST2 CONST &
     GNP TAXES PRIME

'ADIFFILE.DIF' is created and results in the following spreadsheet after being read into LOTUS 1-2-3:

                A         B         C         D         E         F
         +--------------------------------------------------------
       1 |     TIME       GNP     TAXES     PRIME
       2 |   1984Q2    1564.2     678.3      9.76
       3 |   1984Q3    1688.9     778.4     13.45
       4 |   1984Q4        NA        NA        NA
       5 | CONSTANT        C3         0
       6 | CONSTANT    CONST2       223
       7 | CONSTANT     CONST        35
       8 |   VECTOR       VVV         1         2         3
       9 |   VECTOR        V2         4         3         2         1

3.3 Formatted Input and Output

SORITEC Sampler supports formatted input and output of data and text. The command syntax for formatted I/O is similar to FORTRAN formatted I/O. In other words, the read or write statement refers to a FORMAT statement number that contains the format for the input or output.

The FORMAT command has a statement number, the command name FORMAT and a legal format specification, i.e.,

     statement_number FORMAT format_specification

The statement_number is always a positive integer between 1 and 9999. It must be unique within any given session or batch job.
In other words, once a FORMAT is entered and identified by a statement number, no other command can have the same statement number during that session. Allowable "format_specifications" are identical to those permitted in FORTRAN programs. Consult any FORTRAN reference manual for details on FORMAT statements.

3.3.1 FORTRAN Formatted Input

Although free-format SAL files are the preferred way to import data to SORITEC Sampler, there may be occasions when data are structured so that it is necessary to use an explicit format statement. Standard FORTRAN-style format statements are used. Sampler can read formatted data directly from the terminal or from a file. The syntax for reading formatted data is:

     READ([filename] [statement_number]) series_1 series_2 ...

Here, the "statement_number" refers to a previously defined format statement. The optional data file identified by "filename" must have a .SAL file extension. If the filename is omitted, SORITEC Sampler reads the data from the current input device, i.e. the terminal or a SAC file if a command file is being executed. If the format statement number is omitted, data are assumed to be free-formatted.

Input file redirection is supported by the READ statement so that you can read a formatted file from a drive or directory other than the current one if it is referenced within single quotations, i.e.,

     READ('d:filename' statement_number) series_1 series_2 ...

or

     READ('\path\filename' statement_number) &
     series_1 series_2 ...

Unlike regular SAL files, formatted files cannot be read by multiple READ statements; all data from the file must be imported at one time.

Normally, formatted READ commands expect data to be organized in columns. However, if the STREAMIO option is enabled by the ON STREAMIO command, data can be read by rows.
For example, to read the text file MACRO1.SAL, including the headers, given below:

     KEY MACROECONOMIC INDICATORS

                   1970    1971    1972
     GNP         1423.5  1564.2  1688.9
     TAXES        455.6   678.3   778.4
     PRIME RATE   10.75    9.76   13.45;

the following command sequence would be required:

     ON STREAMIO
     USE 1970 1972
     101 FORMAT(///10X,3F8.1)
     READ(MACRO1 101) GNP
     102 FORMAT(10X,3F8.2)
     READ(MACRO1 102) TAXES
     READ(MACRO1 102) PRIME

Although this is almost as straightforward as for standard SAL file input, a FORMAT statement is used and reference to the FORMAT statement number is made in the READ statement. Also unlike standard SAL file reads, you must explicitly reference the variable list in the READ statement and the USE period must be set in the main program before the READ command is executed. The file must still be terminated with a ";" delimiter.

3.3.2 FORTRAN Formatted Output

Data and text may be printed in a prespecified format by the WRITE command. FORTRAN formatted output can be directed to either the terminal or a file. The general format for the formatted write command is:

     WRITE([filename] [statement_number]) var_1 var_2 ...

The statement number refers to a previously defined FORMAT statement. If the optional "filename" is included, SORITEC Sampler writes the data according to the format statement associated with "statement_number" to the file "filename.LST". Otherwise, the data are written to the terminal or the current output device if DOS redirection has been invoked. If the statement number is omitted, data are printed in a list format similar to the format used to PRINT variables at the terminal, e.g.,

     VAR_A
     ................
             .
        1    .    1.00000
        2    .    2.00000
        3    .    2.50000
        4    .    3.50000
        5    .    5.00000

Variables in the variable list may be time-series, constants or parameters. Up to 100 variables are allowed in a variable list.

When time-series or vectors are encountered in the variable list, SORITEC Sampler writes all active observations to the terminal before writing the next variable in the list.
Placing parentheses around time-series variables in the variable list, however, will direct SORITEC Sampler to print one value from each variable in turn, allowing you to print time-series in columns.

     WRITE([filename] statement_number) constant_1 &
     (time_series_1 time_series_2) constant_2

For example, the commands:

     USE 1973Q1 1973Q4
     102 FORMAT(15X,' GNP CONSUMPTION INVESTMENT'//10X,(3F11.1))
     WRITE(102) (gnp consump invest)

produce the following output.

                      GNP   CONSUMPTION   INVESTMENT

                    475.7       301.4         71.0
                    468.3       306.2         70.1
                    487.7       312.8         82.3
                    490.7       320.8         65.6

Constants and parameters cannot be included in parentheses.

3.4 Keyboard Entry

Data may be entered directly from the keyboard using the FILL command, which has the format:

     FILL variable_name value_list

where "value_list" is the set of values assigned to the variable "variable_name". For example,

     FILL VAR_A 1 4 2 5 7 8

creates a new series VAR_A with the six specified values.

When there is no USE command in effect, a FILL command counts the data items, stores them as undated data and defines an appropriate USE interval which is assumed in later commands or until the USE period is redefined. If there are too many or too few observations entered for the current USE period, an error message is generated unless the ON RAGGED option is enabled.

The option command ON RAGGED permits entry, through FILL, of data series that are shorter than the current USE interval without generating an error. Unaccounted data are assigned MISSING values when this condition is encountered. FILL will not accept data series longer than the current USE period under any circumstances.

FILL is commonly used to enter data series that consist of few observations or to extend current data series.

3.5 Output of Data to the Terminal

Data may be output to the terminal in both tabular and graphical form. If necessary, tables and graphs can be routed to the printer by using the DOS "Ctrl-P" switch before entering the appropriate command.
3.5.1 Tabular Display

The simplest data display is produced by the PRINT command. Any data series, vector, constant, parameter, equation or GROUP can be displayed using this command, which has the form:

     PRINT arg_1 arg_2 arg_3 ...

Types of arguments to be printed may be mixed, but this is generally inadvisable. Since SORITEC does not put unlike items on the same lines, mixing types or periodicities indiscriminately can generate lengthy outputs.

The PRINT command can have up to 100 arguments, each of which must be a legal SORITEC name. Lagged variables may be specified in a PRINT command. To display data from the members of a GROUP, the ON GROUP option must be active. PRINT displays the names of GROUP members if OFF GROUP is enabled.

Data may be output to the terminal in specified formats and mixed with text using the WRITE command. Refer to Section 3.3.2 for a description of this command.

3.5.2 Graphical Display

Two types of graphical displays are available from SORITEC Sampler. Both produce line printer-style graphics. SORITEC's estimation commands can also produce medium resolution residuals plots on systems with color graphics capability. These are discussed in Section 10.2.11.

Multi-variable plots of time-series or cross-section data are generated by the PLOT command, which has the form:

     PLOT series_1 symbol_1 series_2 symbol_2 ...

The PLOT command produces a line printer plot of observation number against up to nine variables at once. Plotting symbols must be specified in the command line for each variable to distinguish plotted values. Plotting symbols may be alphanumeric (A-Z, 0-9) or the characters +, -, *, /, =. If two variables, at some observation, are nearly equal so that they occupy the same position on the screen, only the symbol for the latter-named variable is displayed. The horizontal scale is determined automatically so that all data values can be plotted.
The WIDTH command can be used to inform SORITEC Sampler that more (or less) than 72 characters can be output on a single line. In this case, the width of the plot is adjusted accordingly, e.g., WIDTH 132.

To generate meaningful output, all plotted variables should have roughly the same range of values. Otherwise, some multiplicative or additive scaling may be necessary.

The relationship between two variables can be illustrated graphically via the SCATTER command, which is specified as:

    SCATTER series_1 series_2

SCATTER generates a scatter diagram with the variable referenced in the first argument plotted with respect to the vertical or Y-axis and the variable referenced in the second argument plotted against the horizontal or X-axis. Lagged variables are permitted.

The graph size is dependent upon the number of characters that can appear on a line. The default value is 72 but can be changed by the WIDTH command.

3.6 SORITEC DataBank Files

SORITEC DataBank (.SDB) files are the most convenient means of accessing data AFTER the data have been entered into SORITEC Sampler. The databanking facility has its own set of commands for accessing and managing data. These commands are described in the next chapter.

Chapter 4
SORITEC DataBank (SDB) Files

4.0 Introduction

SORITEC databanks are the key to using SORITEC Sampler efficiently. SDB files can store data series, equations, matrices, vectors, scalars, parameters, namelists and multiple equation models. SORITEC Sampler can store an unlimited number of items if enough disk space is available. Planned future enhancements include the ability to store and recall user procedures, report formats, data descriptors and online "HELP" text.

SDB files are constructed in a "knapsack" database arrangement. In effect, you can throw anything you want into an SDB file and then recall it by name later. There is no need to specify the type of the data item, its length, etc.; SORITEC Sampler keeps track of that for you.
The commands necessary to create and manipulate SDB files are straightforward and easy to learn. The complete list is as follows.

4.1 Create a Databank

CREATE constructs and initializes a SORITEC databank. The only argument in the command line is the name of the database that you want to create. For example,

    CREATE filename

will create a file called filename.SDB for future use.

The CREATE command creates the databank on the default drive and directory. However, the file can be created on an alternative drive or directory by enclosing the drive specification and filename in single quotation marks, e.g.

    CREATE 'd:filename'

or

    CREATE '\path\filename'

Once the database is created, it remains open for I/O until either (a) a different database is accessed, (b) the file is RETURNed, or (c) SORITEC Sampler is terminated.

4.2 Access a Databank

ACCESS opens a SORITEC databank for use in the current job session. The general form of the command is:

    ACCESS filename

The database must already exist in the current directory as "filename.SDB" or an error message is generated. Once a database is ACCESSed, SORITEC Sampler automatically copies the data items referenced in a command into the workspace if they are not already there. ACCESS automatically returns any database which is currently open.

Databanks residing on drives other than the current drive may be referenced by enclosing the drive designation and filename within single quotation marks, as noted above for CREATE.

Depending on the implementation, there may be additional arguments to the ACCESS command to specify special file formats (CitiBase for example), passwords or read/write access.

4.3 Release a Databank from SORITEC

RETURN automatically closes any database which is currently open and releases it from SORITEC's control. The format of the command is:

    RETURN

No arguments are required with this command as only the currently ACCESSed databank is referenced.
After the RETURN command, the database is no longer accessible until another ACCESS command is executed.

4.4 Purge a Databank

Databanks may be purged from the DOS directory with the PURGE command. The format of the command is:

    PURGE filename

Since the database is permanently erased, this command should be used with care! PURGE only works on SORITEC databases so it isn't possible to delete an arbitrary file using this command. Reference to a database on a directory or drive other than the current one follows the same rules as the CREATE and ACCESS commands.

4.5 Retrieve Items from a Databank into the Workspace

Data are explicitly copied from the currently accessed databank into the workspace by the COPY command. The command syntax is:

    COPY item_1 item_2 ... item_n

Arguments in the command line may be time-series, constants, parameters, vectors, group names, and equations.

Since the databank is always implicitly searched for items needed by SORITEC commands, this command is generally used only when you need to retrieve data from a second database. If, for example, you wish to regress a measure of inflation, such as CPI, stored on one database, against some measures of final demand, such as PCE and DEFENSE, stored on another, the command sequence would be:

    ACCESS inflate
    COPY cpi
    ACCESS fdemand
    REGRESS cpi pce defense

4.6 Store Items in a Databank

Items in SORITEC's workspace are stored on the currently accessed databank with the KEEP command. The syntax of the command is:

    KEEP item_1 item_2 ... item_n

Each argument, "item_i", can be a data series, constant, parameter, equation, vector or group name. If you try to KEEP an item that has the same name as an item that already exists in the database, a non-fatal error is reported and the item is not replaced.

There are three ways to replace an item that already exists on a SORITEC databank.
First, the item stored in the databank can be explicitly discarded using the DISCARD command and then stored using the KEEP command. Second, the item can be replaced explicitly with the REPLACE command. Lastly, items in a databank can be implicitly replaced with the KEEP command if the ON REPLACE option has been enabled.

KEEP stores all observations associated with a given time-series, regardless of the observation period defined by the currently active USE command. For example, if the series GNP is defined for 1950Q1 to 1984Q2 and the current USE period is 1980Q1 to 1983Q4, the command

    KEEP GNP

stores the series for 1950Q1-1984Q2. You may save only the active observations by entering the command:

    KEEP(ACTIVE) item_1 item_2 ... item_n

4.7 Replace Items in a Databank

Items in databanks are replaced by items of the same name in the current workspace with the REPLACE command. The command syntax is:

    REPLACE item_1 item_2 ... item_n

If the item is not currently stored on the database, a warning message is generated but the item is still saved.

4.8 Rename Items in a Databank

The names of items in a SORITEC databank are changed with the RENAME command, which has the form:

    RENAME new_name_1 old_name_1 new_name_2 old_name_2 ...

RENAME takes an even number of arguments consisting of pairs of item names. The command renames item old_name_i to new_name_i. Note that the ordering of the pair is new_name, followed by old_name, which is reversed from argument orders usually found in SORITEC.

4.9 Switch the Names of Two Items in a Databank

Pairs of items in a SORITEC databank can have their names swapped by the SWITCH command. The syntax of the command is:

    SWITCH item_1 item_2

It is equivalent to the series of commands:

    RENAME temp item_1
    RENAME item_1 item_2
    RENAME item_2 temp

4.10 Discard Items from a Databank

Items are erased from a databank with the DISCARD command. The format of DISCARD is:

    DISCARD item_1 item_2 ... item_n

Once DISCARDed, the item is irretrievably lost.

4.11 Generate a Directory Listing of a Databank

An alphabetically sorted directory listing of a SORITEC databank is produced with the CONTENTS command, which has the form:

    CONTENTS [filename]

If "filename" is omitted from the command line, SORITEC Sampler produces a directory listing of the currently active databank. If no databank is active, an error message is returned.

The optional argument "filename" is the name of a SORITEC database in the current directory. Reference to a database on a directory or drive other than the current one follows rules similar to the CREATE, ACCESS, and PURGE commands.

Note that the command:

    CONTENTS filename

attaches the named databank after returning the one currently attached. To reference the previous databank, you must re-attach it with the ACCESS command.

Chapter 5
Programming Constructs

5.0 Introduction

SORITEC provides a powerful interpretive programming language that enables the user to simplify complex and repetitive estimation procedures into a smaller set of commands that can be executed interactively or through SORITEC's batch processing facility. SORITEC's programming language supports numeric and alpha looping, and conditional and unconditional transfer of control to other statements. When set up as a SORITEC Alternative Command (SAC) file, this programming language provides a convenient means for developing more complex estimators and diagnostic statistics in addition to those provided directly by SORITEC Sampler.

The alternate command file facility enables command files to call other command files so that a series of command sequences can be executed. Note that command files can be chained together but they cannot be nested. This means that program control does not implicitly return to the command file from which the call was made.
SORITEC also provides a PROCEDURE facility that allows you to structure a sequence of commands into a subprogram that, once defined, can be passed arguments and repetitively called, like a subroutine, from a SORITEC command line. The PROCEDURE facility is not available in SORITEC Sampler.

The commands associated with SORITEC Sampler's programming language follow.

5.1 Numeric Looping

Repetitive execution of commands in SORITEC Sampler is accomplished by DO loops. The DO loop has the following general format:

    DO index = beginning_value TO end_value BY increment
       .
       .
       (SORITEC Sampler commands)
       .
       .
    END

The DO loop index, beginning_value, end_value and increment may be integer or real scalars or parameters and you can proceed forward or backward through the loop by assigning a positive or negative value to the increment. Both the end_value and increment may be reset dynamically within the loop. If so, the new values are used to determine whether the loop is executed again.

If the BY increment is omitted from the DO command line, it is set to 1. A DO command, with no specified values for "beginning_value", "end_value" and "increment", will cause the statements before the END command to be executed once.

If the DO variable's initial value exceeds its maximum value before a positive increment is added, an error message is generated and the statements between the DO and END statements are not executed. The same situation results if the variable's initial value is set lower than a final value to be reached by negative increments.

You can construct a DO loop to index through members of a group. For example, the commands:

    GROUP group_name series_1 series_2 ... series_n
    ON GROUP
    DO i = 1 TO n
       REGRESS y group_name(i)
    END

would regress the dependent variable "y" against each of the time-series in the group "group_name" successively.

5.2 Unconditional Branching

SORITEC Sampler allows you to transfer control to any command prefixed by a statement number.
The format of the command is simply:

    GO TO statement_number

Alternatively, the command may be specified as GOTO.

Statement numbers may be numbers, CONSTANTs or PARAMETERs and must be in the range 1 to 9999. They may be prefixed to most commands and FORMAT statements, but not GO TO statements. Other commands that may not be prefixed are:

    JOB       ONLIST
    HELLO     OFFLIST
    SCAN      MAXERR
    WIDTH     COMMENT

In batch mode, if the specified command number does not exist, an error message is generated, and control passes to the statement which follows the GO TO command. In interactive mode, the system responds with a query for the missing statement number until the statement number is entered.

5.3 Conditional Branching

Conditional branching is enabled through an IF/THEN/ELSE command structure. The general format for the command sequence is:

    IF condition; THEN; command_sequence_1; ELSE; command_sequence_2

A "condition" must be an arithmetic expression that may include logical and relational operators, as needed. When the condition is satisfied, control transfers to "command_sequence_1", otherwise control is transferred to "command_sequence_2". The IF/THEN/ELSE sequence MUST be delimited by semicolons, as specified above. An IF/THEN/ELSE command structure CANNOT be nested.

Command sequences in conditional branching statements may be composed of a single command or a series of commands. If more than one command comprises a command sequence, they must be structured in a DO loop, e.g.,

    IF a > b; THEN; DO
       c = b * log(a)
       print a b c
    END; ELSE; DO
       c = a * log(b)
       plot a # b *
    END

A DO loop in an IF/THEN/ELSE sequence can be executed repetitively by specifying the index, initial value, final value and, optionally, the increment in the DO command line.

Either the THEN or the ELSE clause may be omitted from a conditional branching command sequence.

The IF command can also be used with the GO TO command to control the order of execution, e.g.

    IF x < y .and. a > b; THEN; GO TO 300

5.4 Null (Continuation) Statement

The CONTINUE statement is generally used in SORITEC Sampler to position a statement number within a SORITEC program. Its syntax is:

    statement_number CONTINUE

As such, it is not executed.

5.5 Alpha Looping

SORITEC Sampler will repetitively execute a sequence of commands by indexing over a set of alphabetic loop control variables. On each pass through the loop, SORITEC Sampler supplies succeeding alphabetic arguments in the DOT statement. The DOT statement is functionally similar to a DO command. The format of the command is:

    DOT variable_1 variable_2 ... variable_n
       .
       .
       (SORITEC Sampler commands)
       .
       .
       .
    ENDDOT

Alpha loop control variables are successively entered into expressions within the DOT loop by replacing each colon (":") within the DOT loop with the currently active alpha variable, i.e.,

    DOT a b c                           REGRESS y a
    REGRESS y :     is executed as      REGRESS y b
    ENDDOT                              REGRESS y c

You may also use the colons as suffixes to construct new variables within DOT loops, e.g.,

    DOT var1 var2 var3                       outvar1 = inpvar1 * z
    out: = inp: * z      is executed as      outvar2 = inpvar2 * z
    ENDDOT                                   outvar3 = inpvar3 * z

The colon may not be used as a prefix, however. All commands in the DOT loop are executed as many times as there are variables in the DOT command.

Note that if group expansion is enabled by the ON GROUP switch, a DOT loop can index through a GROUP, i.e.

    GROUP group_name var_1 var_2 var_3 ...
    ON GROUP
    DOT group_name
       regress y :
    ENDDOT

would regress the dependent variable, y, against each of the time-series in the GROUP "group_name".

Chapter 6
Dummy Data Series Generation and Special Transformation Commands

6.0 Introduction

SORITEC Sampler provides several commands that generate or transform time-series. These commands create dummy variables or they transform existing data series into new time-series.
They include facilities for converting time-series from one periodicity to another and for transforming continuous into discrete variables. SORITEC Sampler also provides commands that compute modular division and invoke maximum and minimum functions.

6.1 Create a Time Trend Dummy Series

SORITEC Sampler generates a time trend dummy series with the TIME command. The syntax of this command is:

    TIME [series_name]

TIME sets the first observation of the "series_name" associated with the currently active USE period equal to one and increments successive observations by one, so that the second observation is set to two, the third to three, etc. If the "series_name" is omitted from the command line, TIME stores the time trend dummy in a series named "time". If a variable by that name already exists in the workspace, it will be overwritten by the TIME command.

The TIME command may only be invoked when there are no internal gaps in the current USE period, i.e., the current USE period must have been invoked with only two arguments.

6.2 Create Seasonal Dummies

A periodic dummy variable can be created using the DUMMY command, which has the form:

    DUMMY output_series first_observation skip_increment

In the command line, "first_observation" is the first observation set to one. Series elements are then set to one every "skip_increment" observations. The remaining values of the series are set to zero.

6.3 Recode a Variable

SORITEC Sampler allows you to convert a continuous variable into a discrete variable via the RECODE command. The form of the command line is:

    RECODE output_series input_series p(1) p(2) p(3) p(4) ...

In the above command line, "input_series" is the series to be recoded and "output_series" is the categorized output variable. The p(i) are the interval boundaries for the recoding process.

To show the RECODE function, the commands:

    FILL a 3 17 21 28 31 35 26 41
    RECODE b a 10 20 25 30 35 40
    PRINT a b

produce these results.

              A      B

      1       3      0
      2      17      1
      3      21      2
      4      28      3
      5      31      4
      6      35      5
      7      26      3
      8      41      6

For each element, i, of the series, RECODE uses the following formula:

    output_series(i) = k   if p(k-1) =< input_series(i) < p(k)
                           when p(k-1) <> p(k), and

    output_series(i) = k   if p(k-1) = input_series = p(k)

p(0) is always considered to be -infinity, and p(n+1) (where n is the number of p(i) in the command) is always considered to be +infinity.

6.4 Conversion of Time-Series from One Periodicity to Another

The periodicity of dated and undated time-series is converted by SORITEC Sampler with the CONVERT command. The command has the following syntax:

    CONVERT [(modifier)] output_series = input_series

When the command is executed, data of one periodicity are converted to the periodicity specified by the current USE statement. In other words, the periodicity of the "input_series" does not have to be explicitly specified, since SORITEC Sampler determines it internally. Lags are not allowed in CONVERT arguments and the entire series is always converted, regardless of the range specified in the USE command.

While the standard syntax of the CONVERT command requires the specification of both an output (result) series and an input series, the converted series can be written to the input series name simply by specifying:

    CONVERT [(modifier)] input_series

After the conversion, the old values of the input series, in the old periodicity, are lost.

The modifier argument in the command line is optional, and controls the type of conversion which takes place. There are two sets of modifiers, one for aggregation (such as monthly to annual), and one for disaggregation (such as annual to monthly), plus a special MOVE modifier for converting to and from undated data.
The modifiers are:

    AGGREGATION
       SUM       Sum observations in each period (default)
       AVERAGE   Average observations in each period
       MIN       Find the minimum observation in each period
       MAX       Find the maximum observation in each period
       LAST      Use the last observation in each period

    DISAGGREGATION
       FILL      Use the data point for the entire period for each sub-period
       SHARE     Divide the data value for the entire period equally across
                 all sub-periods (default)

    UNDATED TO DATED CONVERSIONS
       MOVE      Move the data from an undated to a dated variable or vice
                 versa without alteration (default)

Modifiers do not have to be entered into the command line explicitly if the default is selected.

Conversion is currently permitted only between annual, semi-annual, quarterly, monthly, ten-day and undated data types. In addition, conversion from monthly to ten-day periodicity produces incorrect results because of the way the ten-day data type is defined. See Section 2.4 for information on data types supported by SORITEC.

6.5 Maximum Function

SORITEC Sampler can determine the maximum of a series or can generate a new series from several containing the maximum value associated with each observation.

The maximum value of a series is found by entering the MAX command with only two arguments, i.e.,

    MAX maximum_value input_series

When entered like this, "input_series" is the data series over which the maximum is to be taken. The result is stored in "maximum_value" which must be a CONSTANT or PARAMETER. If the "maximum_value" name is undefined prior to entering the command, SORITEC Sampler defines it to be a CONSTANT.

A new series consisting of the set of maximum values, by observation, associated with several series is generated by the MAX command when more than two arguments are entered in the command line, i.e.,

    MAX output_series input_series_1 input_series_2 ...

In this case, all arguments in the command line must be data series.
The resulting "output_series" contains the observation-by-observation maximum of all the remaining arguments. Up to 99 input series can be evaluated by this command.

6.6 Minimum Function

The minimum value of a data series or a series of minimum values, by observation, of several series is obtained using the MIN command. The format and use of MIN is identical to the MAX command except for the result it computes. In other words, the minimum value of a data series is determined when the MIN command is followed by two arguments:

    MIN minimum_value input_series

where the first argument is a CONSTANT or PARAMETER and the second is the series you wish to evaluate.

A series containing observation-by-observation minimums is generated when more than two arguments, all of which must be data series, follow the MIN command, i.e.,

    MIN output_series input_series_1 input_series_2 ...

The same restrictions as apply to the MAX function apply to MIN.

6.7 Modular Division

SORITEC Sampler performs modular division via the MOD command, which has the following format:

    MOD remainder dividend divisor

In mathematical notation, the formula used is:

    remainder = dividend - (INT(dividend/divisor) * divisor)

where INT is the integer part of the quotient within parentheses. The dividend and divisor must be of the same type and may be CONSTANTs, PARAMETERs or data series with the resulting "remainder" being the same type.

Modular division is useful for generating sequences of uniform random numbers in SORITEC Sampler.

6.8 Compute Moving Average

The moving average of a series is calculated by the MA command.

    MA output_series input_series length

In the command line, "input_series" is the series to be averaged, "length" is the length of the moving average, and "output_series" is the resulting series. The argument, "length", may be a CONSTANT, PARAMETER, or a numeric quantity.

The first n observations of the output_series, equivalent to the length of the moving average, are treated as MISSING data.
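
The MOD formula of Section 6.7 can be checked outside SORITEC with a few lines of Python. This is only a sketch of the arithmetic, not SORITEC code: the names soritec_mod and mod_series are hypothetical, and it assumes that INT drops the fractional part of the quotient (truncation toward zero, as in Fortran), which the manual does not state explicitly.

```python
import math


def soritec_mod(dividend, divisor):
    """remainder = dividend - INT(dividend / divisor) * divisor."""
    # Assumption: INT truncates toward zero, so math.trunc stands in for it.
    return dividend - math.trunc(dividend / divisor) * divisor


def mod_series(dividends, divisors):
    """Apply the same formula observation by observation to two series."""
    return [soritec_mod(a, b) for a, b in zip(dividends, divisors)]


print(soritec_mod(7, 3))                     # 1
print(soritec_mod(-7, 3))                    # -1
print(mod_series([10, 11, 12], [4, 4, 4]))   # [2, 3, 0]
```

Under the truncation assumption the remainder takes the sign of the dividend, so the result for a negative dividend differs from Python's own % operator, which returns 2 for -7 % 3.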

6.9 Compute Moving Sum

The MSUM command computes the moving sum of a series.

    MSUM output_series input_series length

Arguments in the command line have the same meaning as in the MA command. The first n observations of the output_series, equivalent to the length of the moving sum, are treated as MISSING data.

6.10 Statistical Operations

Several statistical functions are available for analyzing and manipulating data. They are described in the following sections.

6.10.1 Correlation Matrix Calculation

A correlation matrix for the variables in an argument list is generated by the CORREL command. The format of the command is:

    CORREL series_1 series_2 series_3 ...

Only observations active in the currently defined USE period are used in correlation matrix calculations. While only the correlation matrix is output to the terminal, the correlation matrix (COR), vector of means (MEANS), vector of standard deviations (DEVS) and covariance matrix (COV) are calculated by CORREL and stored as SORITEC internal variables. These results may be accessed with a RECOVER command.

6.10.2 Covariance Matrix Calculation

The COVA command computes, stores and prints a covariance matrix for the variables named as arguments in the command line. The format of the command is:

    COVA series_1 series_2 series_3 ...

Similar to the CORREL command, only observations associated with the currently active USE period are used in calculations. The vector of means (MEANS), vector of standard deviations (DEVS) and covariance matrix (COV) are stored as SORITEC internal variables when the COVA command is executed, and may be accessed by the RECOVER command.

6.10.3 Other Statistical Operations

Several specialized statistical operations are supported by SORITEC Sampler to describe the properties of a time-series.
All operations have a standard format which consists of the command name, followed by the output variable and the input series, i.e.,

    COMMAND output_constant input_series

Statistics are calculated over the currently active USE period. The statistical operations available in SORITEC Sampler and commands for executing them are:

    Command                                   Description
    -------                                   -----------
    MEAN mean input_series                    Arithmetic Mean
    RMS  root_mean_square input_series        Root Mean Square
    SUM  sum input_series                     Arithmetic Sum
    SSR  sum_squared_resids input_series      Sum of Squared Residuals

Chapter 7
SORITEC Financial Functions

7.0 Financial Functions in SORITEC

SORITEC Sampler contains most of the common financial analysis functions. These functions, used alone or with SORITEC's forecasting commands, provide extremely powerful tools for performing financial project evaluation. The functions currently provided include internal rate of return, present value, and various loan amortization schedules.

Note that in all SORITEC Sampler financial functions, interest rates are treated as decimal quantities unless otherwise noted; specifically, 15% is represented as 0.15.

7.1 Internal Rate of Return

The internal rate of return command calculates the internal rate of return for an arbitrary series "X" via a modified Newton-Raphson search algorithm. The format of the command is

    IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) &
        interest_rate net_income_series

where "interest_rate" is a legal SORITEC constant name for the resulting interest rate which discounts the "net_income_series" to a zero net present value.

Alternatively, the IRR command can be used to calculate the internal rate of return on the profits or benefits associated with a project with known costs. In this situation, the form of the command is:

    IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) &
        interest_rate benefits costs

Here, the second series is subtracted from the first in calculating the IRR.
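
The idea behind IRR, finding the rate at which the net present value of a series is zero, can be sketched in Python as follows. This is an illustration under stated assumptions, not SORITEC's algorithm: it uses plain Newton-Raphson rather than the modified search the manual mentions, it assumes the first observation is not discounted, and it stops when the net present value itself falls below the tolerance. The function names and the starting rate of 0.1 are hypothetical; the iteration limit of 50 and tolerance of .00001 mirror the documented defaults for ITER and TOL.

```python
def npv(rate, flows):
    """Net present value; the first flow is assumed to occur at time 0."""
    return sum(x / (1.0 + rate) ** t for t, x in enumerate(flows))


def irr(flows, initial_rate=0.1, max_iter=50, tol=1e-5):
    """Plain Newton-Raphson search for the rate with npv(rate, flows) == 0."""
    rate = initial_rate
    for _ in range(max_iter):
        value = npv(rate, flows)
        if abs(value) < tol:          # simplified convergence test (assumption)
            return rate
        # Derivative of the NPV with respect to the rate.
        slope = sum(-t * x / (1.0 + rate) ** (t + 1)
                    for t, x in enumerate(flows))
        if slope == 0.0:
            break
        rate -= value / slope
    raise ValueError("IRR search did not converge")


# Two-series form: the costs are subtracted from the benefits first.
benefits = [0.0, 60.0, 60.0]
costs = [100.0, 0.0, 0.0]
net = [b - c for b, c in zip(benefits, costs)]
print(round(irr(net), 4))
```

Supplying a different initial_rate plays the role the manual gives to INITIALR: when cash flows change sign more than once, different starting values can lead the search to different roots.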
The optional modifiers in the command line allow the user to control the parameters determining convergence for the algorithm as well as specification of an arbitrary start-up capital cost. Specifically,

    CAPITAL    is the start-up cost of the project. It is automatically
               subtracted from the first period profits.

    ITER       is the maximum number of iterations for the search. The
               default is 50.

    TOL        is the tolerance level that defines convergence. An absolute
               or relative change in the net present value of less than TOL
               results in convergence. The default value is .00001.

    INITIALR   allows the user to specify a starting value for the
               iterations. This is of special value in finding multiple
               roots to the IRR equation when cash flows change signs more
               than once during the life of the project.

7.2 Present Value

The present value command, PV, calculates the net present value of a stream of net benefits (or profits) associated with a financial venture. PV will take either a scalar value for the interest rate or a time series of forecast values. This latter feature, when combined with the estimation and forecasting capabilities of SORITEC Sampler, provides a powerful tool for simulating and evaluating financial projects. The syntax of the command is:

    PV([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>]) &
       present_value net_income_stream <costs> interest_rate

where "present_value" is a scalar value equal to the present value of the income stream, "net_income_stream" is the net income stream to be discounted, and "interest_rate" is the interest rate used in calculating the present value. The interest rate can be either a scalar, fixed for all periods, or a time series of interest rates. This allows for easy incorporation of interest rate forecasts into project evaluation.

The "net_income_stream" can be followed by an optional cost series. This second argument in the command line can be either a single net income stream or a pair of series describing the revenues and costs of the project.
The optional modifiers in the command line allow the user to convert the periodicity of the interest rate to conform to the net income stream and to specify the type of conversion to be performed. Specifically,

    PERIOD     allows an interest rate conversion to be specified;
               specifically, setting PERIOD equal to one of the options
               results in the specified interest rate being converted from
               the selected periodicity to the periodicity of the current
               USE period. The periodicity may be (D)aily, (W)eekly,
               (T)en Day, (M)onthly, (Q)uarterly, (S)emi-annual or (A)nnual.

               A second option, specified either as SIMPLE or COMPOUND, is
               the type of conversion to be used. The default is COMPOUND
               conversion.

The PERIOD modifier used with the conversion option can handle transformations between annual or effective interest rates and the effective periodic percentage rates. If the annual rate is given as 15%, the effective annual percentage rate is 16.0754%, calculated as .15/12 = 1.25% compounded monthly. For example,

    PV(PERIOD=A,SIMPLE) pv_result PROFIT .15

will correctly convert the 15% annual percentage rate to a 1.25% monthly rate before calculating the present value.

If the available data are given in terms of effective yields, the COMPOUND option should be used to correctly convert rates between periods. A loan requiring 4% per quarter is equivalent to a loan rate of 1.316% compounded monthly [exp(ln(1.04)/3)-1]. Here, the appropriate command would be:

    PV( PERIOD=Q, COMPOUND ) pv_result PROFIT .04

7.3 Loan Amortization

The loan amortization procedure (AMORT) provides a convenient technique for calculating the monthly payment for a given loan situation. In addition to the standard loan value and interest rate setup, AMORT also supports an arbitrary number of loan payment series, balloon payments, variable interest rates, as well as options for dynamically extending the amount of the loan through additional borrowings.
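
The SIMPLE and COMPOUND conversions behind the PERIOD modifier, which PV and AMORT share, can be verified numerically with a short Python sketch. The function names are hypothetical and the code only reproduces the arithmetic quoted in Section 7.2; it makes no claim about how SORITEC implements the conversion internally.

```python
def simple_rate(rate, periods):
    """SIMPLE conversion: divide the longer-period rate evenly."""
    return rate / periods


def compound_rate(rate, periods):
    """COMPOUND conversion: the sub-period rate that compounds to `rate`."""
    return (1.0 + rate) ** (1.0 / periods) - 1.0


def effective_rate(periodic_rate, periods):
    """Rate earned over `periods` compounding steps of `periodic_rate`."""
    return (1.0 + periodic_rate) ** periods - 1.0


monthly = simple_rate(0.15, 12)                    # 15% annual, 12 months
print(round(monthly, 4))                           # 0.0125
print(round(effective_rate(monthly, 12), 6))       # 0.160755
print(round(compound_rate(0.04, 3), 5))            # 0.01316
```

The first two lines of output reproduce the 1.25% monthly rate and the 16.0754% effective annual rate from the text (the extra digit shown is rounded up); the third reproduces the 1.316% monthly equivalent of 4% per quarter.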
The format of the AMORT command is:

    AMORT([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>], &
          [RULEOF78],[BALLOON=#]) &
          payment loan interest_rate [aux_pay_1 ... aux_pay_n]

where "payment" is the resulting per period payment to fully amortize the loan during the current USE period, and "loan" is the amount of the loan. The loan can be either a constant or a time-series if the loan is allocated over the time period set in the USE command. "interest_rate" is the interest rate of the loan. It must be the same type, either constant or time-series, as the "loan".

The optional command line arguments, "aux_pay_i", are time-series of auxiliary payments in addition to the monthly loan payment. These can be used to enter payments to principal that are awkwardly or randomly timed. For example, a loan which required balloon payments of $5000 every five years can be handled as a time-series with value 5000 for every fifth year and zeros elsewhere.

The optional modifiers in the command line allow the user to change the amortization schedule as follows:

    PERIOD     is the same as for PV. It allows an interest rate conversion
               to be specified; specifically, setting PERIOD equal to one of
               the options results in the specified interest rate being
               converted from the selected periodicity to the periodicity of
               the current USE period. The periodicity may be (D)aily,
               (W)eekly, (T)en-Day, (M)onthly, (Q)uarterly, (S)emi-annual or
               (A)nnual.

    RULEOF78   constructs a principal and interest payment series for the
               loan according to the "Rule of 78" (sum of the months). This
               option is only valid for loans with a single period of
               borrowing and a fixed interest rate.

    BALLOON    allows the specification of a balloon payment in the final
               period.
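
For the simplest case AMORT covers, a single borrowing at a fixed per-period rate, the per-period payment can be sketched with the textbook level-payment (annuity) formula. The manual does not state the formula AMORT uses, so the Python below is an assumption-laden illustration: level_payment and remaining_balance are hypothetical names, payments are assumed to fall at the end of each period, and the balloon is assumed to be the balance still owed after the last regular payment.

```python
def level_payment(loan, rate, periods, balloon=0.0):
    """Per-period payment that leaves `balloon` owing after `periods` payments."""
    if rate == 0.0:
        return (loan - balloon) / periods
    discount = (1.0 + rate) ** -periods
    return (loan - balloon * discount) * rate / (1.0 - discount)


def remaining_balance(loan, rate, payment, periods):
    """Balance after `periods` payments, accruing interest before each one."""
    balance = loan
    for _ in range(periods):
        balance = balance * (1.0 + rate) - payment
    return balance


payment = level_payment(1000.0, 0.01, 12)      # 1% per month for 12 months
print(round(payment, 2))                       # 88.85
print(abs(remaining_balance(1000.0, 0.01, payment, 12)) < 1e-6)   # True
```

Running the schedule forward with remaining_balance is a useful check on any payment figure: a fully amortizing payment should drive the balance to zero, and a payment computed with a balloon should leave exactly the balloon amount outstanding.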
Chapter 8

SORITEC Sampler Cross-Section Techniques

8.0 Introduction

The full version of SORITEC contains most of the common techniques for processing and analyzing cross-sectional data sets. In addition to providing access to most of the intermediate and final results, it implements several diagnostic tests not reported by most statistical packages. The specific subset of techniques currently implemented in SORITEC Sampler is as follows:

     SYNOPSIS  provides a quick statistical summary of a data series.

     XTAB      carries out a standard r * c contingency table analysis, including tests of independence.

8.1 Synopsis

The SYNOPSIS command returns a detailed summary analysis of a data series including mean, standard deviation, median (including a 95% confidence interval), mode and the frequency of the mode, quartiles, deciles, variance, skewness, kurtosis, coefficient of variation, number of observations, number of missing values, minimum, maximum and range. The command format of SYNOPSIS is:

     SYNOPSIS var_1 var_2 ... var_n

In addition to outputting them to the terminal, SYNOPSIS stores the summary statistics as SORITEC internal variables, which may be recovered either explicitly with the RECOVER command or by implicit reference. See the description of the RECOVER command in Section 2.10 to retrieve these data.

Except for the DECILE and QUARTIL statistics, internal variables associated with the SYNOPSIS command are stored as vectors that have the same number of elements as there are arguments in the SYNOPSIS command line.
Recoverable SORITEC internal variables stored as vectors are:

     COUNT   = number of non-missing observations for each variable
     MEDIAN  = median value for each variable
     MIN     = minimum values
     MAX     = maximum values
     RANGE   = range for each variable (max - min)
     MEANS   = mean values
     VARS    = variances for each variable
     DEVS    = standard deviations
     CV      = coefficient of variation for each variable
     KURT    = kurtosis of each variable
     SKEW    = skewness for each variable
     MODE    = mode values for each variable

Two other internal variables are stored upon execution of the SYNOPSIS command. The variables are:

     DECILE  = decile values of a series
     QUARTIL = quartile values of a series

Currently, the DECILE and QUARTIL internal variables are each stored as a single vector, meaning that decile and quartile values are stored only for the last argument in the command line. Quantiles are defined as the first observations less than or equal to the true mathematical quantiles (n/4 and n/10) in both cases.

Note that SYNOPSIS exercises casewise deletion of missing values on each variable when it computes the summary statistics. Because of this, the statistics may not agree with those from other SORITEC statistics commands such as STATS, KURTOSIS, etc.

8.2 Crosstabulation Analysis

The XTAB command calculates the standard r * c crosstabulation report. The format of the command is:

     XTAB series_1 series_2

The arguments "series_1" and "series_2" must be discrete data. If the series you wish to crosstabulate are continuous, they must first be converted via the RECODE command. XTAB does not delete missing values but instead reports them as a separate category "MISSING" in the appropriate row or column.

In addition to printer-oriented output, XTAB has an interactive screen display mode which allows scrolling through the table in a "spreadsheet" mode. This feature is described in Chapter 10.

XTAB stores the following internal results. The full table is stored only when the NOMATS option is OFF.
     ^NROW    = number of distinct row values (variable #1)
     ^NCOL    = number of distinct column values (variable #2)
     ^RMARGIN = an nrow x 1 vector containing the row margin values
     ^CMARGIN = an ncol x 1 vector containing the column margin values
     ^XTABLE  = nrow by ncol matrix composing the inner table

Chapter 9

Estimation and Forecasting with SORITEC Sampler

9.0 Introduction

The SORITEC Sampler provides you with several estimation techniques for both single-equation and simultaneous-equation models. Both ordinary least squares (OLS) and two-stage least squares regression estimators are available. In addition, both the Cochrane-Orcutt and Hildreth-Lu autocorrelation techniques for the single-equation model are supported by SORITEC Sampler. These procedures may be applied to either time-series or cross-section data. However, the structure of the equations in any model to be estimated must be linear. The fitted equations of all linear models estimated by SORITEC Sampler can be recovered and forecast.

The standard output from a SORITEC estimation command consists of a coefficient tableau and a summary tableau of regression diagnostics which includes the number of observations, the standard error of the regression, mean of the dependent variable, R squared, R Bar squared, Durbin-Watson, F test of overall significance, the log-likelihood, and the Akaike and Schwarz statistics for model selection.

The user may have the estimator generate additional diagnostics by setting one or more options with ON commands, which must be executed before the regression command. Use of these options is described in Chapter 2. SORITEC estimation procedures support ON VCOV, ON STATS, ON CCOR, ON ANOVA, ON PLOT, ON RESIDUAL and ON BETA commands. These options are associated with SORITEC's interactive tableaus and are described in Chapter 10.
When the ON CRT option is invoked, all estimation commands described in this chapter support the display of regression diagnostics in interactive tableaus. These tableaus provide the user with a greater number of regression diagnostics than are output by the estimation commands in their default modes. Commands for invoking the interactive tableaus and descriptions of their contents are detailed in the next chapter.

9.1 Ordinary Least Squares (OLS) Estimation

The ordinary least squares estimator is invoked by the REGRESS command, which has the following syntax:

     REGRESS [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n

The dependent variable must be the first argument in the variable list, with the independent variables following immediately as the second through last arguments. The keyword ORIGIN is optional and, if specified, forces SORITEC Sampler to estimate the equation without a constant term. Otherwise, the constant term is supplied automatically, not by the user. If ORIGIN is specified in the command line, it must be enclosed within parentheses. When the regression plane is forced through the origin, the regression diagnostics are adjusted accordingly.

9.2 Autocorrelation Techniques for the Single Equation Model

Two estimation techniques are available for estimating single equation models when the user believes that the error terms are not independent, but that a disturbance in one period influences later disturbances. The Cochrane-Orcutt (CORC) iterative technique and the Hildreth-Lu (HILU) scanning technique estimate models assuming first-order serial autocorrelation of the disturbances.

When either autocorrelation technique is invoked, SORITEC Sampler temporarily shortens the USE period by one observation at the beginning of the sample and by one observation after every gap to calculate the required transformed data. The USE command in force, therefore, should include the observations which are lost in the transformation of variables.
The USE period is then restored to its original interval(s) after the command is completed. Regression diagnostics are calculated from the residuals of the regression on the transformed variables.

9.2.1 Cochrane-Orcutt Iterative Technique

The Cochrane-Orcutt estimator is invoked by the command:

     CORC [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n

Command syntax considerations are identical to those associated with the REGRESS command described in the previous section.

9.2.2 Hildreth-Lu Scanning Technique

In addition to the dependent and independent variable lists, the HILU command needs a lower limit and an upper limit for the value of rho, and a stepsize for the scanning process. These values may be entered by the user in the command line as a set of optional positional parameters. The syntax of the HILU command is:

     HILU [([ORIGIN] ROMIN ROMAX ROSTEP)] dep_var &
          ind_var_1 ind_var_2 ... ind_var_n

where the dependent and independent variable lists are positioned as in the other regression commands. ROMIN is an optional positional parameter that defines the lower limit of rho. Similarly, ROMAX specifies the upper limit of rho. The stepsize of the scanning process is defined by the third positional parameter, ROSTEP. If omitted from the command line, these parameters assume default values of 0.0, 1.0 and 0.1, respectively. The user can selectively initialize these parameters by entering the wild card symbol * in positions where default values are to be assumed and the desired numeric values in the other positions. For example, the command:

     HILU (* * .05) dep_var ind_var_1 ind_var_2 ... ind_var_n

initializes ROMIN and ROMAX to their default values of 0.0 and 1.0, respectively, and sets ROSTEP to the user-selected value of 0.05. If positional parameters are entered into the command line, they must be enclosed within parentheses.
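
The scanning idea behind HILU can be illustrated outside SORITEC. The following is a minimal Python sketch under stated assumptions: it uses the textbook Hildreth-Lu procedure (for each rho on the grid, regress y(t) - rho*y(t-1) on x(t) - rho*x(t-1) and keep the rho with the smallest sum of squared residuals), a single independent variable, and made-up data. The function names and the selection criterion are assumptions; the manual does not state how SORITEC Sampler chooses among the scanned values. Only the defaults 0.0, 1.0 and 0.1 are taken from the text.

```python
import random


def ols_fit(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, sum of squared residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, ssr


def hildreth_lu(x, y, romin=0.0, romax=1.0, rostep=0.1):
    """Scan rho over [romin, romax] and return the fit with the smallest SSR."""
    n_steps = int(round((romax - romin) / rostep))
    best = None
    for i in range(n_steps + 1):
        rho = romin + i * rostep
        # Quasi-difference the data; the first observation is lost, as the
        # manual notes when it says the USE period is shortened by one.
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, b, ssr = ols_fit(xs, ys)
        if best is None or ssr < best["ssr"]:
            # The transformed intercept equals const * (1 - rho).
            const = a_star / (1.0 - rho) if rho != 1.0 else float("nan")
            best = {"rho": rho, "const": const, "slope": b, "ssr": ssr}
    return best


# Made-up data: y = 1 + 2x plus first-order autocorrelated disturbances.
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(40)]
e = [0.0]
for _ in range(39):
    e.append(0.6 * e[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]

print(hildreth_lu(x, y))                # grid 0.0, 0.1, ..., 1.0
print(hildreth_lu(x, y, rostep=0.05))   # finer grid, analogous to HILU (* * .05)
```

A finer ROSTEP costs more regressions but can only lower (or leave unchanged) the best sum of squared residuals when the finer grid contains the coarser one, as it does here.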
9.3 Two-Stage Least Squares (2SLS) Estimates

Consistent estimates for a single equation from a simultaneous equation system can be obtained by using a two-stage least squares (2SLS) estimator. Unlike the other estimation commands in this chapter, the 2SLS procedure requires the user to enter two commands to estimate an equation. First, all exogenous variables must be identified in an EXOGENOUS statement, which has the form:

     EXOGENOUS exog_var_1 exog_var_2 ... exog_var_n

All arguments associated with this command are exogenous variable names. The EXOGENOUS command must be specified before invoking the 2SLS estimator. After execution, all later 2SLS commands use the same list of exogenous variables until another EXOGENOUS command is entered.

Two-stage least squares estimation is invoked by the TWOSTAGE command, which has the form:

     TWOSTAGE [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n

All arguments, as well as the ORIGIN keyword, have the same interpretation as in the REGRESS command. Two-stage least squares commands that detect omitted or mis-specified exogenous variables generate error messages until a valid EXOGENOUS command is executed.

9.4 Forecasting Single Equation Models

Any single-equation model that has been estimated by SORITEC Sampler can be forecast using the fitted equation that is stored as a SORITEC internal variable. To forecast an equation, all of the independent or right-hand variables that were used to estimate it must be defined for the period over which the forecast is to be made. These values may be observed, projected, assumed or may be the product of other forecasts.

While forecasting results from the execution of a single command, a series of commands must be executed to generate meaningful results.

     (1)  Estimate a single equation model using the REGRESS, CORC, HILU or TWOSTAGE command.

     (2)  Change the active observation period to the forecast period with the USE command.
     (3)  RECOVER the fitted equation from its internal system name of FOREQ.

     (4)  Use the FORECAST command to forecast the fitted equation over the desired time period.

The format of the FORECAST command is:

     FORECAST fitted_equation_name

Since SORITEC internal system names may be referenced directly from the FORECAST command, step (3) is optional. In this case, the fitted equation is forecast simply by entering:

     FORECAST ^FOREQ

Use of the RECOVER command is necessary, however, if you want to FORECAST the fitted equation after estimating other models, since SORITEC replaces ^FOREQ each time an equation is estimated. Fitted equations can be databanked like most other SORITEC items.

Forecasting single equation models in SORITEC Sampler is illustrated in the example below.

     USE 1975Q1 1982Q4
     REGRESS gnp consumption investment(-1)
     RECOVER gnp_equation FOREQ
     USE 1983Q1 1984Q3
     FORECAST gnp_equation
     PRINT gnp

If the fitted equation is not needed after being forecast, the command sequence is:

     USE 1975Q1 1982Q4
     REGRESS gnp consumption investment(-1)
     USE 1983Q1 1984Q3
     FORECAST ^FOREQ
     PRINT gnp

The FORECAST command executes only a static forecast. This means that lagged independent variables are not automatically generated for each successive period but instead must be supplied during the forecast. In other words, the command sequence:

     USE 1980Q1 1984Q4
     REGRESS gnp gnp(-1)
     USE 1985Q1 1985Q4
     FORECAST ^FOREQ

is illegal and generates an error if there are no data for "gnp" beyond 1985Q1.

Note that the FORECAST command stores the forecasted values of the dependent variable under the same name as the dependent variable previously defined. This means that any existing values for the dependent variable over the forecast period are replaced and cannot be retrieved. All existing values for the dependent variable outside the forecast period are retained, however, with the result that forecasted values are spliced into the original series as though the REVISE command had been used.
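
The static-forecast and splicing behavior just described can be mimicked with a few lines of Python. This is an illustrative sketch only, not SORITEC's implementation: the function name, the integer period index, and the one-lag equation y(t) = const + slope * y(t-1) are all hypothetical. It assumes that every lagged value must already exist in the supplied data (forecasts are never fed forward) and that forecasts overwrite values only inside the forecast period.

```python
def static_forecast(series, const, slope, periods):
    """Statically forecast y[t] = const + slope * y[t-1] over `periods`.

    `series` maps an integer period index to an observed value. Lagged
    values are read from the data as supplied, never from earlier forecasts.
    """
    supplied = dict(series)            # snapshot of the data before forecasting
    forecasts = {}
    for t in periods:
        if t - 1 not in supplied:
            raise ValueError("no supplied value for the lag at period %d" % (t - 1))
        forecasts[t] = const + slope * supplied[t - 1]
    spliced = dict(series)
    spliced.update(forecasts)          # replace values inside the forecast period only
    return spliced


history = {0: 100.0, 1: 102.0, 2: 104.0, 3: 999.0}

# One step ahead works, because the lag (period 2) is observed. The existing
# value for period 3 is replaced; periods 0 to 2 are retained.
print(static_forecast(history, 1.0, 1.0, [3]))

# Two steps ahead fails in this sketch if period 4 is not supplied, because
# the forecast for period 5 would need an observed value at period 4.
try:
    static_forecast(history, 1.0, 1.0, [4, 5])
except ValueError as err:
    print("error:", err)
```

Because the value at period 3 is overwritten, a copy of the original series has to be taken first if the old values are still wanted.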
To preserve existing values, the dependent variable series should first be copied to another series name or databanked before forecasting the fitted equation, e.g.,

     USE 1975Q1 1982Q4
     REGRESS gnp consumption investment(-1)
     RECOVER gnp_equation FOREQ
     USE 1983Q1 1984Q3
     temp_gnp = gnp
     FORECAST gnp_equation
     PRINT gnp temp_gnp

As values for "temp_gnp" are MISSING prior to 1983Q1 (since the active USE period was 1983Q1 to 1984Q3 when the transformation was executed), the original series is recreated by the command sequence:

     USE 1983Q1 1984Q3
     REVISE gnp = temp_gnp

Alternatively, copy both estimation and forecast period observations to temporary variables before forecasting an equation.

Chapter 10

SORITEC Interactive Print Server

10.0 Introduction

SORITEC Sampler allows complete control over the output presentation for selected procedures. In REGRESS and XTAB the user controls the order and depth of the presentation of the results. REGRESS generates 10 separate output summaries which may be selected, or repeated, in any order that you desire. XTAB allows you to scroll through the crosstabs table in a "spreadsheet" mode, or switch to the table of summary statistics. In addition, a HELP menu is provided which describes each display option.

The interactive regression display supports 10 different screen displays including 3 tables of residual summaries, a residual plot, the covariance matrix of coefficients, the correlation matrix of coefficients, extended regression reports (beta coefficient, partial r and elasticities), a regression summary table, the ANOVA table for goodness of fit, means and standard deviations of the independent variables and of course the regression estimates.

When the interactive mode is in effect, a selection menu appears on the last line of the screen. Entering a ? will bring up a more detailed help menu regarding the contents of each display. Selecting an invalid choice sounds the "bell" and prompts you for another choice.
Several special keystrokes, in addition to those in the selection menu, control the interactive display. Entering a carriage return, a '+' or a space advances the display to the next tableau in the selection menu. Entering a backspace returns you to the previously displayed tableau. Entering a '-' displays the previous screen in the selection menu.

The interactive option is available for REGRESS, TWOSTAGE, CORC, and HILU.

10.1 Entering Interactive Mode

To enable the interactive mode you must turn on the option by entering the command:

     ON CRT

When this option is enabled, SORITEC Sampler automatically switches into an interactive presentation whenever a command is executed that supports the interactive tableaus. To stop the interactive presentation, enter OFF CRT. SORITEC Sampler will resume normal output presentation.

10.2 Tableau Descriptions

The following sections discuss each tableau and its associated menu selection code available with SORITEC estimation commands.

10.2.1 Coefficient Display (E)

Coefficient estimates are automatically displayed when the regression equation is estimated. The presentation shows the technique, the current sample period, coefficients, standard errors, t-values and the significance levels of the t statistic.

10.2.2 Regression Summary Table (G)

The regression summary table provides a quick synopsis of the regression. The table reports the number of observations, mean of the dependent variable, the log-likelihood ratio, Schwarz and Akaike criteria, R-squared (adjusted), the standard error of the regression, Durbin-Watson and F-statistics and the significance of the F-statistic. If the ORIGIN option is specified, the statistics are adjusted appropriately.

10.2.3 Residual Autocorrelation Summary (R)

The residual summary table provides information on the distribution of the residuals (mean, variance, skewness, kurtosis, minimum, maximum, average absolute error, etc.) and the autocorrelation structure of the residuals with Durbin-Watson (for one, four and 12 periods) and the first 24 Box-Pierce statistics. All these statistics, along with the first 24 autocorrelation coefficients, may be recovered for later analysis.

10.2.4 PDF and Histogram of Standardized Residuals (H)

This table provides a quick summary of the distribution of the residuals for quick identification of outliers or a skewed distribution, and shows the percentage of residuals falling between each integer multiple of the regression error variance, including a histogram of the same information. The histogram information has a higher resolution than the table since each line of the screen represents 1/3 of a standard deviation. Because of this, the scale may at times appear to be somewhat off; specifically, if the maximum table value is 40%, the maximum vertical value on the plot might be, say, 17%.

10.2.5 Non-Parametric Residual Distribution Tests (N)

This table provides a set of statistical tests on the normality of the residual distribution as well as tests of the randomness of the residuals. Specifically, SORITEC Sampler carries out a "Run of Signs" test for randomness, a chi-square test against the normal distribution, and a Kolmogorov test for normality.

10.2.6 Regression ANOVA Table (A)

This is the standard ANOVA table showing the derivation of the F-statistic reported in the summary table. As with the summary table, all reported statistics are adjusted appropriately when the regression equation is constrained through the origin. ON ANOVA will activate this output when the OFF CRT flag, or non-interactive mode, is set.

10.2.7 Covariance Matrix of Coefficient Estimates (V)

This tableau displays a variance-covariance matrix of the coefficients. It is equivalent to the display produced by the ON VCOV option when the OFF CRT option is set.
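
A variance-covariance matrix of coefficients and a correlation matrix of coefficients (the VCOV and CCOR internal results listed in Appendix I) are related by the usual scaling: each covariance is divided by the product of the two corresponding standard errors. The short Python sketch below illustrates that relationship with made-up numbers. It is not SORITEC code, the function name is hypothetical, and the manual does not say whether SORITEC computes its correlation matrix in exactly this way.

```python
import math


def cov_to_corr(vcov):
    """Scale a square covariance matrix (list of lists) to a correlation matrix."""
    k = len(vcov)
    se = [math.sqrt(vcov[i][i]) for i in range(k)]   # coefficient standard errors
    return [[vcov[i][j] / (se[i] * se[j]) for j in range(k)] for i in range(k)]


# Made-up 2 x 2 covariance matrix: variances 4 and 9, covariance 1.
vcov = [[4.0, 1.0],
        [1.0, 9.0]]
for row in cov_to_corr(vcov):
    print([round(v, 4) for v in row])
```

The diagonal of the result is always 1, and the off-diagonal entries lie between -1 and 1, which is what makes the correlation form convenient for a quick look at pairs of coefficients.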
10.2.8 Correlation Matrix of Coefficient Estimates (C)

Although there is little theory regarding the correlation matrix of coefficient estimates, it does provide a quick way to examine the relationship between pairs of coefficients. ON CCOR will present this display when SORITEC Sampler is in OFF CRT mode.

10.2.9 Beta Coefficients, Elasticities and Partial R (B)

This tableau presents coefficient estimates and their associated Beta coefficients, elasticities and partial correlation coefficients. ON BETA enables this display when the OFF CRT option is set.

10.2.10 Statistical Summary of Exogenous Variables (S)

This table reports the mean and standard deviation of the independent variables. When the OFF CRT option is set, this display is activated by ON STATS.

10.2.11 Actual vs Fitted Plot and Standardized Residuals (P)

This display shows the actual versus fitted values and the standardized residuals for the regression. The plot is produced in a form that is reproducible by line printers unless your PC has an IBM color graphics compatible display. In that case, the plots appear in 3-color medium resolution mode. ON PLOT activates this output when the OFF CRT option is set.

10.3 Interactive Crosstabs

The XTAB command allows for interactive scrolling through the table in a spreadsheet manner, along with the option to present the summary statistics for the current table. In this mode, keys are interpreted as follows:

     (X)  move down one screen
     (S)  move left one screen
     (D)  move right one screen
     (E)  move up one screen
     (T)  view the summary table of test statistics
     (Q)  quit the crosstabs
APPENDIX I

SORITEC INTERNAL SYSTEM NAMES

--------------------------------------------------------------------------
INTERNAL
SYSTEM    TYPE OF             PRODUCED BY
NAME      ITEM                COMMANDS*                       LENGTH    DESCRIPTION
--------------------------------------------------------------------------
CCOR      MATRIX              (5)                             NV**2     CORRELATION MATRIX OF COEFFICIENTS
COEF      VECTOR              (5)                             NV        REGRESSION COEFFICIENTS
COR       MATRIX              CORREL                          NARGS**2  CORRELATION MATRIX
COV       MATRIX              COVAR, CORREL                   NARGS**2  COVARIANCE MATRIX
DEP       ALPHANUMERIC ITEMS  (2),(3),ALMON,REGRESS,TWOSTAGE  1         NAME OF DEPENDENT VARIABLE
DEVS      VECTOR              STATS,CORREL                    NARGS     STANDARD DEVIATIONS OF VARIABLES
DW        CONSTANT            (5)                                       DURBIN-WATSON STATISTIC
FACTOR    VARIABLE            ADJUST                          NOBS      SEASONAL FACTOR SERIES
FOREQ     EQUATION            REGRESS,TWOSTAGE                N/A       FITTED EQUATION FOR FORECASTING
GAPS      CONSTANT            USE                                       NUMBER OF GAPS IN CURRENT USE COMMAND
ITERS     CONSTANT            (2),(3),(4)                               ITERATIONS USED IN ARRIVING AT COEFFICIENTS
LAGCOi    VECTOR              ALMON                           NDEGi+1   LAG COEFFICIENTS ON iTH DISTRIBUTED LAG VARIABLE
LAGSEi    VECTOR              ALMON                           NDEGi+1   STANDARD ERRORS OF LAG COEFFICIENTS LAGCO(i)
LAGSUMi   CONSTANT            ALMON                                     SUM OF LAG COEFFICIENTS FOR iTH DISTRIBUTED LAG VARIABLE
MEANS     VECTOR              STATS,CORREL                    NARGS     MEANS OF VARIABLES
MLAGi     CONSTANT            ALMON                                     MEAN LAG FOR iTH DISTRIBUTED LAG VARIABLE
NARGS     CONSTANT            COVAR,CORREL,STATS                        NUMBER OF VARIABLES IN ARGUMENT LIST
NDEGi     CONSTANT            ALMON                                     DEGREE OF iTH DISTRIBUTED LAG VARIABLE
NEQ       CONSTANT            (4)                                       NUMBER OF EQUATIONS ESTIMATED
NGAPS     CONSTANT            (5)                                       NUMBER OF GAPS IN USE USED FOR LAST REGRESSION
NOBS      CONSTANT            (5)                                       NUMBER OF OBSERVATIONS USED IN LAST REGRESSION
NV        CONSTANT            REGRESS,(2),(3),TWOSTAGE                  NUMBER OF INDEPENDENT RIGHT-HAND VARIABLES IN LAST REGRESSION
NV        CONSTANT            (4), ALMON                                NUMBER OF COEFFICIENTS OR PARAMETERS ESTIMATED BY LAST (4) OR ALMON COMMAND
OBS       CONSTANT            USE                                       NUMBER OF OBSERVATIONS IN CURRENT USE
RAWEQ     EQUATION            REGRESS,TWOSTAGE                N/A       USER'S ORIGINAL UNFITTED EQUATION
REGSE     CONSTANT            (2),(3),ALMON,REGRESS,TWOSTAGE            STANDARD ERROR OF REGRESSION
RHO       CONSTANT            (2)                                       1ST-ORDER AUTOCORRELATION COEFFICIENT
RHO       VECTOR              (3)                             2         1ST-ORDER AND 2ND-ORDER AUTOCORRELATION COEFFICIENTS
RSQ       CONSTANT            (5)                                       R-SQUARED
RSQADJ    CONSTANT            (5)                                       R-SQUARED ADJUSTED FOR DEGREES OF FREEDOM
SE        VECTOR              (5)                             NV        COEFFICIENT STANDARD ERRORS
SSR       CONSTANT            ALMON,REGRESS,TWOSTAGE,(2),(3)            SUM OF SQUARED RESIDUALS
VCOV      MATRIX              (5)                             NV**2     VARIANCE-COVARIANCE MATRIX OF COEFFICIENTS
YFIT      VARIABLE            (5)                             NOBS      FITTED VALUES
YMEAN     CONSTANT            ALMON,REGRESS,TWOSTAGE,(2),(3)            MEAN OF DEPENDENT VARIABLE
--------------------------------------------------------------------------

*INTERNAL RESULTS ARE PRODUCED BY THE COMMANDS ASSOCIATED WITH THE FOLLOWING NUMBERS:

     (1) REGRESS, TWOSTAGE, MVR, THREESTAGE
     (2) HILU, TSHILU, CORC, TSCORC
     (3) HILU2, TSHILU2, CORC2, TSCORC2
     (4) MVR, THREESTAGE, nonlinear REGRESS, nonlinear TWOSTAGE
     (5) ALMON, (1), (2), (3)

NOTE: Not all commands are available in SORITEC Sampler.

APPENDIX II

GLOBAL OPTIONS AND DEFAULT SETTINGS IN SORITEC
----------------------------------------------

           DEFAULT
OPTION     SETTING   DESCRIPTION
------     -------   -----------
ALIAS      OFF       The ALIAS option controls the printing of variable names in output produced by SORITEC commands invoked from a PROCEDURE. It is not supported in SORITEC Sampler.
ANOVA      OFF       When the OFF CRT option is in effect, ON ANOVA generates a standard ANOVA table with SORITEC estimation results showing the derivation of the F-statistic reported in the summary table. It is otherwise generated by the A-key in interactive mode.
BETA       OFF       When the OFF CRT option is in effect, ON BETA generates the regression tableau that presents coefficient estimates and their associated Beta coefficients, elasticities and partial correlation coefficients. This tableau is also generated by the B-key in interactive mode.
BRIEF      OFF       Suppresses command number prompts in interactive mode, as well as messages reminding the user to close DO loops and procedures, and to satisfy outstanding GO TO's.
CCOR       OFF       Correlation matrix of regression coefficients is printed after every regression.
CRT        OFF       The CRT option is used with the PAGESIZE command to control SORITEC output to the CRT terminal. When the CRT option is ON, SORITEC prints only PAGESIZE or fewer lines of information before pausing. Entering a carriage return resumes output. ON CRT also enables the tableaus associated with SORITEC's estimation and XTAB commands.
DETAIL     OFF       Not implemented at this release.
DIVZERO    ON        Not implemented at this release.
DOLLAR     OFF       When the DOLLAR flag is turned ON, dollar signs in SORITEC input are interpreted as semicolons (statement separators). Use of this feature is not recommended and the flag will be removed in a future release.
DYNAMIC    OFF       Causes transformations involving lagged variables to be performed dynamically instead of statically.
ECHO       OFF       Echoes input lines to output device.
GROUP      OFF       Enables automatic group expansion in commands.
HEAD       ON        Prints standard headings on each page (batch runs only).
JOURNAL    OFF       The JOURNAL flag controls writing of interactive input to the journal file. It is set OFF when SORITEC begins execution and is set ON when interactive processing mode is invoked by the HELLO command.
LOG        OFF       Not implemented at this release.
MISSING    ON        Causes warning messages to print where the user accesses observations which never have been given a value.
NEGEXP     OFF       Not implemented at this release.
NEGLOG     ON        Not implemented at this release.
NOEJECT    OFF       Not implemented at this release.
NOERROR    OFF       Not implemented at this release.
NOMATS     ON        Saves workspace by suppressing storage of the VCOV, CCOR, and RAWEQ internal results after each regression.
PERFECT    OFF       Not implemented at this release.
PLOT       OFF       Plots actual versus fitted values of the dependent variable after every regression. The plot is generated in a form reproducible by line printers unless your PC has an IBM color graphics compatible display, in which case it appears in 3-color medium resolution mode.
PRINT      OFF       Controls printing of intermediate computational results.
PROMPT     OFF       Not implemented at this release.
RAGGED     OFF       When enabled, the RAGGED option allows you to assign fewer observations to a variable using the FILL command than are associated with the current USE period. Usually, an error message is generated when this condition exists. FILL assigns MISSING values to observations beyond the end of shorter series to the end of the USE period. ON RAGGED does NOT permit the entry of more observations than specified in the current USE period.
RAWEQ      ON        The RAWEQ option, when enabled, stores the raw equation associated with any regression estimated by SORITEC under the internal variable name ^RAWEQ. Disabling the option saves symbol table space, since several coefficients are stored for each RAWEQ entry.
REPLACE    OFF       When REPLACE is turned ON, the databanking KEEP command saves items on the currently ACCESSed databank regardless of whether name conflicts occur with items already stored in the databank. In other words, KEEP acts like a REPLACE command when this option is enabled.
RESIDUAL   OFF       When the OFF CRT option is in effect, the RESIDUAL global option generates three of the tableaus associated with regression tableaus in CRT mode. These are:
                     (1) the Residual Summary Table, which provides information on the distribution of the residuals (mean, variance, skewness, kurtosis, minimum, maximum, average absolute error, etc.) and the autocorrelation structure of the residuals with Durbin-Watson (for one, four and 12 periods) and the first 24 Box-Pierce statistics.
                     (2) PDF and Histogram of Standardized Residuals, providing a quick summary of the distribution of the residuals for quick identification of outliers or a skewed distribution. It also shows the percentage of residuals falling between each integer multiple of the regression error variance, including a histogram of the same information.
                     (3) Non-Parametric Residual Distribution Tests, providing a set of statistical tests on the normality of the residual distribution as well as tests of the randomness of the residuals.
REVISE     OFF       Enables automatic splicing and updating of time-series. With REVISE set ON, all assignment and FILL statements behave as though they are prefixed by a REVISE command. This means that observations are added to existing series if the current USE period is outside the range of the USE period under which the data series was last defined. If the current USE period is a subset of the USE period under which the symbol was last defined, no truncation of the series occurs.
SMPL       OFF       Not implemented at this release.
STATS      OFF       Prints the mean and standard deviation of all independent variables in a regression.
STREAMIO   OFF       When enabled, this option allows formatted READ commands to read successive observations of a variable along a row, rather than down a column, as normally expected.
TRAIL      OFF       When enabled, the TRAIL option generates a debug trail for diagnosing SORITEC bugs.
UPRINT     ON        UPRINT controls the printing of underscores (_) in variable names. When enabled, SORITEC prints the underscores.
VCOV       OFF       Variance-covariance matrix of regression coefficients is printed after every regression.

APPENDIX III

QUICK REFERENCE LISTING OF SORITEC Sampler COMMANDS

* denotes commands that accept wildcard characters in arguments.

ACCESS filename
ACCESS 'd:filename'
ACCESS '\directory1\directory2\filename'
AMORT([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>], &
      [RULEOF78],[BALLOON=#]) &
      payment loan interest_rate [aux_pay_1 ... aux_pay_n]
COMPUTE equation_name
[COMPUTE] transformation_expression
CONSTANT const_1 [value_1] const_2 [value_2] ...
CONTENTS [filename]
CONTENTS 'd:filename'
CONTENTS '\directory1\directory2\file
Statement_number CONTINUE
CONVERT [(modifier)] input_series
CONVERT [(modifier)] output_series = input_series
COPY item_1 item_2 ... item_n
CORC [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
CORREL series_1 series_2 series_3 ...
COVA series_1 series_2 series_3 ...
CREATE filename
CREATE 'd:filename'
CREATE '\directory1\directory2\filename'
DISCARD item_1 item_2 ... item_n
DO index = beginning_value TO end_value BY increment
     END
DOT variable_1 variable_2 ... variable_n
     ENDDOT
DUMMY output_series first_observation skip_increment
END
ENDDOT
EQUATION equation_name [equation]
EXECUTE filename
EXECUTE 'd:filename'
EXECUTE '\path\filename'
EXOGENOUS exog_var_1 exog_var_2 ... exog_var_n
FILL variable_name value_list
FLAGS flag_vector
FORECAST fitted_equation_name
FORECAST ^FOREQ
Statement_number FORMAT format_specification
*FORGET [item_name]
GO TO statement_number (also GOTO)
*GROUP group_name name_1 name_2 ... name_n
HELLO
HILU [([ORIGIN] ROMIN ROMAX ROSTEP)] dep_var ind_var_1 &
     ind_var_2 ... ind_var_n
IF condition; THEN; command_sequence_1; ELSE; command_sequence_2
IMPUTE [ZERO|MEAN|INTER|TREND|NONE]
IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) &
    interest_rate net_income_series
IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) interest_rate benefits costs
JOB job_label
KEEP item_1 item_2 ... item_n
KEEP(ACTIVE) item_1 item_2 ... item_n
MA output_series input_series length
MAX maximum_value input_series
MAX output_series input_series_1 input_series_2 ...
MAXERR number
MEAN mean input_series
MIN minimum_value input_series
MIN output_series input_series_1 input_series_2 ...
MISSING constant_name
MOD remainder dividend divisor
MSUM output_series input_series length
OFFLIST
ONLIST
PARAMETER param_1 [value_1] param_2 [value_2] ...
PLOT series_1 symbol_1 series_2 symbol_2 ...
PRINT arg_1 arg_2 arg_3 ...
PUNCH series_1 series_2 ...
PUNCHDIF[(filename)] arg_1 arg_2 arg_3 ...
PUNCHDIF('[d:][\path\]filename') arg_1 arg_2 arg_3 ...
PURGE filename
PURGE '[d:][\path\]filename'
PV([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>]) ...
   present_value net_income_stream <costs> interest_rate
QUIT
READ(filename)
READ('[d:][\path\]filename')
READ([filename] [statement_number]) series_1 series_2 ...
READ(['[d:][\path\]filename'] [statement_number]) &
     series_1 series_2 ...
READDIF(filename)
READDIF('[d:][\path\]filename')
READDIF(filename) series_1 series_2 ...
READDIF([filename] statement_number) series_1 series_2 ...
RECODE output_series input_series p(1) p(2) p(3) p(4) ...
RECOVER [new_name] internal_name
REGRESS [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
RENAME new_name_1 old_name_1 new_name_2 old_name_2 ...
REPLACE item_1 item_2 ... item_n
RETURN
REVISE transformation_expression
RMS root_mean_square input_series
SCAN number
SCATTER series_1 series_2
SSR sum_squared_resids input_series
SUM sum input_series
SWITCH item_1 item_2
SYNOPSIS var_1 var_2 ... var_n
*SYMBOLS [ALL]
TIME [series_name]
TITLE [label]
TWOSTAGE [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
USE [begin_1] [end_1] [begin_2] [end_2] ...
USEIF expression
VECTOR vector_name value_1 value_2 ...
WIDTH number
WRITE([filename] [statement_number]) var_1 var_2 ...
WRITE(['[d:][\path\][filename]'] [statement_number]) &
      var_1 var_2 ...
WRITE([filename] [statement_number]) constant_1 (time_series_1 &
      time_series_2) constant_2
WRITE(['[d:][\path\][filename]'] [statement_number]) &
      constant_1 (time_series_1 & time_series_2) constant_2
XTAB series_1 series_2


APPENDIX IV

DETAILED FEATURE LIST FOR SORITEC VERSION 1.06B

1.
REGRESSION TECHNIQUES

  Ordinary Least Squares Regression
    Linear
    Non-linear #
    First-order Cochrane-Orcutt or Hildreth-Lu
    Second order C-O or H-L #
    Fast regression using the Cholesky Decomposition
    ARMA residuals #
    GLS autocorrelation estimation #

  Stepwise Regression *
    Forward or backward methods
    CP statistics
    Multiple levels for the inclusion of variables

  Exponential Smoothing Techniques
    Single exponential, Brown's linear & quadratic, Holt's linear,
      adaptive response, Winter's linear and seasonal
    Linear trend, S-curve and exponential growth forecasting

  Two-Stage Least Squares
    Linear
    Non-linear #
    First order C-O or H-L
    Second order C-O or H-L #
    Fast two-stage using the Cholesky Decomposition

  Distributed Lag Models
    Almon
    Shiller
    Ability to recover and forecast with the unscrambled equation
      of the Almon with C-O or H-L coefficients

  Advanced Single Equation Techniques
    Ridge regression #
      With arbitrary diagonal matrix or canonical scaling #
    Generalized least squares #
      GLS with C-O #
    Restricted least squares #
    Theil-Goldberger mixed estimation *
    Principal components analysis
    Minimax parameter estimation
    Probit analysis #
    Discriminant analysis * #
    F Test of linear hypothesis
    F Test of non-linear hypothesis #
    Calculate confidence intervals for non-linear functions of
      coefficients #

  Regression Diagnostics
    Standard errors and t-values
    Sum of residuals
    Sum of squared residuals
    Mean absolute residual
    Significance of t values
    Beta coefficients
    Partial R values
    F statistic and significance
    Residual analysis
      Durbin-Watson 1st, 4th and 12th order
      Skewness and kurtosis
      First 24 auto-correlation coefficients and Box-Pierce Q
        statistics
    ANOVA table for regression
    Elasticities
    Distribution tests on Residuals
      Percentage distribution of residuals between -3 to +3
        standard deviations
    Procedures allow for regression through the origin and adjust
      the test statistics appropriately
    Statistics adjusted correctly for gaps in sample period
    Significance levels for all test statistics

  Interactive, table-oriented, output display for easy review of
    regression results

2. SYSTEMS ESTIMATION TECHNIQUES

  Zellner's Seemingly Unrelated Regression
    Linear and non-linear #
    Iterative refinement of residual correlations (IRRC)
      optional #

  Three-Stage Least Squares
    Linear and non-linear #
    With IRRC #

  Full Information Maximum Likelihood
    Linear and non-linear #
    User selection of optimization method, stepsize algorithm,
      and convergence criteria #

  Box-Jenkins Analysis
    Autocovariance
    Autocorrelation
    Partial autocorrelation and confidence intervals
    ARMA (p,q), and ARIMA (p,d,q)
    ARIMA with seasonal differencing
    Multivariate distributed lags with ARMA errors #
    Multivariate transfer functions
    Common rational coefficients models #
    Linearized form models #
    Gaps allowed in lag structure #
    Selection of holdout or backcasting #
    Arbitrary initial errors allowed #
    Multiplicative form models #

3. FORECASTING AND SIMULATION

  Single Equation Forecasting
    Static forecast
    Dynamic forecasting
    Residual feedback
    Non-linear forecasts

  Multiple Equation Forecasting
    Static simulation #
    Dynamic simulation #
    Non-linear equations allowed #
    Conditional expressions in equations allowed #
    Simultaneous equation capability #

  Solution of simultaneous non-linear equations #
  Automatic block-decomposition of simultaneous models #
  Successive over- and under-relaxation user-selectable #
  Easy comparison of scenarios #
  User control of convergence criteria and values #

4. FINANCIAL AND ECONOMIC MEASURES

  Present value
  Internal rate of return
  Depreciation
    Straight line, double-declining balance, sum of years digits,
      ACRS and ADR schedules
  Loan amortization
  Peak to peak interpolation
  Capital stock accumulation #
  Capital utilization #
  Net capital investment #
  Capital stock calculation #
  Calculation of economic capacity #
  Calculation of price indices #

5. CROSS-SECTIONAL AND SURVEY TECHNIQUES

  Casewise deletion of missing values
  Frequency distributions
  Histograms
  Synopsis command
  T-Tests of grouped or paired data

  Analysis of Variance
    ONEWAY and TWOWAY
    Any combination of fixed or random factors
    Covariates allowed
    Unequal number of observations allowed
    Diagnostic testing included
    Automatic determination of the appropriate analysis, i.e.,
      1- or 2-way, with/without interaction terms
    Frequency and histogram options
    Replications supported

  Crosstabulation tables
    Nesting for multi-dimensional tables
    Full set of test statistics
    Interactive "spreadsheet" mode for reviewing output

  Breakdown Analysis
    Nested breakdowns
    Histograms
    ANOVA testing

  Non-Parametric Statistics
    Wilcoxon W+, signed rank test, run of signs test,
      Mann-Whitney U test, Spearman correlation, Kendall tau
    Rank function for construction of other non-parametric tests,
      e.g., non-parametric ANOVA, etc.

  Recovery of all intermediate results for cross-sectional
    procedures
  Most procedures support dynamic recoding of continuous data to
    discrete categories
  Most procedures support selection of a subset of discrete values
    for an analysis

6. TIME SERIES UTILITIES AND OPERATIONS

  Weighted/moving averages and sums
  Time series filter
  RECODE function to convert continuous data ranges into discrete
    indicators
  Convert periodicities between annual, monthly, quarterly,
    weekly, daily and undated data types (* for some combinations)
  Subscript ranges allowed in leads and lags, e.g., X(-1 TO -6)
    expands to X(-1) X(-2) ...X(-6) throughout the command syntax

  Basic Statistics
    Mean, standard deviation, mode, median, variance, skewness,
      kurtosis, range, deciles, quartiles, coefficient of
      variation, root mean square, correlation, covariance
      analysis, Z scores, minimum, maximum, casewise deletion of
      missing values

  Normalization of time series
  Seasonal dummy creation
  Splicing function to merge two versions of the series into one
    continuous series; including simple splice, sliding weights,
    or regression with sliding weights *

  Analysis of Goodness of Fit
    Runs test, chi squared normality tests, Box-Pierce Q
      statistics, frequency distribution of residuals,
      Durbin-Watson 1st, 4th and 12th order

  DIF transformation to apply the nth difference operator to a
    series k times. #

  Random Number Generators
    Beta, chi-squared, exponential, double exponential, F,
      geometric, normal, Poisson, t, uniform

  Cumulative Density Functions
    Normal, t, F, beta *, gamma *, chi-squared *, run of signs CDF

  Seasonal Adjustment Techniques
    Ratio to moving average for monthly, quarterly, or arbitrary
      periodicity
    Census X-11 * #

7. MATHEMATICAL FUNCTIONS AND OPERATIONS

  Algebraic entry of transformations
  Logical operators supported
  Modular arithmetic function
  Sine, cosine, tangent, arc sine, arc cosine, arc tangent, log,
    log10, sinh, cosh, tanh, arc sinh, arc cosh, and arc tanh
    functions
  Ceiling, floor, round, sign, abs, random and inverse normal PDF
    functions available

  Substitution of missing values using zero, mean, interpolation
    or linear trend forecast values

  Missing values propagate as missing in all math operations;
    0*MISSING propagates as 0
  Logical operators can be specified in algebraic form, e.g., >,
    >=, <, <=, etc.
  Mixed logical and arithmetic operators allowed in expressions

  TSP-like matrix commands
    Add or subtract two matrices, transpose a matrix, matrix
      orthogonalization, triangular matrix inversion, matrix
      factorization, move vector to a diagonal matrix, extract
      diagonal elements to a vector

  Full algebraic matrix mathematics e.g.,
    B=INV(TR(X)*X)*TR(X)*Y, allows easy construction of complex
    estimators #

8. DATABANKING CAPABILITIES

  Maximum number of items in a database limited only by disk
    space.
  Sorted contents listing
  Databank can store data series, equations, vectors, matrices,
    and linked models
  Simple one word database commands to create, access, update,
    copy, rename, switch, replace, list or discard database items.
  Database usage identical across mainframe, minicomputer and
    microcomputer versions #

9. PROGRAMMING LANGUAGE

  Structured Programming Language Features
    User-defined procedures, labeled/numbered statements, global
      variables, local variables, recursion allowed, GOTO,
      IF/THEN/ELSE, DO loops, DOT loops (over alpha index),
      subscripted references allowed, external command files
      allowed

  Equations and transformations specified in algebraic form
  Wildcards allowed in most commands
  Variable subscript references, e.g., X(K) (except in equations)

  Lags can be specified as negative subscripts, e.g. X(-1) is the
    first lag

  Access to intermediate and final results using a keyword RECOVER
    command, or by item name e.g.: RECOVER YFIT, or
    RESID = Y-^YFIT

  Namelist capability using GROUP command

  Subscripted references to namelist elements allowed, e.g., if
    GROUP GRP1 contains X1 X2 X3 X4, then GRP1(3) is X3

  LEGAL function allows the user to test for missing values and
    develop custom missing value handling routines, e.g.,
    casewise, mean substitution, etc.

10. DATA ENTRY

  Free-field data or FORTRAN formatted entry from disk or keyboard
  DIF file I/O capabilities
  TROLL print format input *
  DBase II I/O supported *
  Can be interfaced with mainframe databases, e.g., Citibase,
    Predicasts, IMF, OECD, etc.
  Custom database interfaces and conversions to IBM PC/XT format
    available on a contract basis

  Commercial databases available on diskettes for the PC and other
    non-mainframes (e.g., Citibase, etc.)

  Data can be downloaded in SORITEC Alternate Load (.SAL) file
    format from major data vendors (DRI, WEFA, CITIBASE
    Connection)

11. REPORT-WRITING CAPABILITIES

  Simplified report layout with complete user control of format,
    titles, contents, footnotes, labels and currency symbols #
  Automatic row/column subtotals, grand totals, averages,
    products, differences, ratios and percentages #
  Automatic footnoting #
  Store and recall report formats #

  Complex reports generated by a single command #

  Specification of asterisks or blanks for small or missing
    values #

12. GRAPHICS

  Printer graphics and plots

  Medium resolution screen-oriented graphics

  DIF I/O bridge to presentation-quality graphics programs

13. GENERAL FEATURES

  Batch and interactive modes available
  Item names may be thirty-two characters long
  Equations may be recovered and printed

  Full function command line editor allows the user to edit and
    rerun one or more previous commands

  User access to differentiation routine
  Input and Output journaling
  SORT command
  Global control over plots, statistics, etc.

14. PC VERSION SPECIFICS

  User may exit to the operating system, run other programs and
    return to SORITEC session without losing any work
  DOS commands can be executed inside SORITEC, allowing editors,
    communications programs, etc. to be used in SORITEC procedures
  Supports DOS redirection and use of fully qualified file names
    for access to subdirectories

____________________________________
# Indicates features available only in full SORITEC. All other
  features are in SORITEC.

* Available second quarter 1985.
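The feature list above states that a subscript range such as X(-1 TO -6) expands to X(-1) X(-2) ...X(-6). The following is a minimal Python sketch of that expansion rule, offered only as an illustration: it is not SORITEC code, the helper name expand_lag_range is hypothetical, and it assumes a range is always written as NAME(start TO end) with integer bounds.

```python
import re

# Hypothetical helper for illustration only; this is not SORITEC code.
# Assumption: a range is written as NAME(start TO end) with integer bounds.
_RANGE = re.compile(r"^\s*(\w+)\(\s*(-?\d+)\s+TO\s+(-?\d+)\s*\)\s*$", re.IGNORECASE)


def expand_lag_range(term):
    """Expand 'X(-1 TO -6)' into ['X(-1)', ..., 'X(-6)'].

    Terms that are not subscript ranges are returned unchanged.
    """
    match = _RANGE.match(term)
    if match is None:
        return [term.strip()]
    name = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    step = 1 if end >= start else -1  # walk from start toward the end bound
    return [f"{name}({k})" for k in range(start, end + step, step)]


if __name__ == "__main__":
    print(" ".join(expand_lag_range("X(-1 TO -6)")))
```

Under these assumptions, the term X(-1 TO -6) yields the six lags X(-1) through X(-6) in order, which matches the expansion described in the feature list; how SORITEC itself parses such ranges is not documented here.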
Random Access Memory

              Required    Recommended
SORITEC       512K        640K

8087 high-speed math chip required for SORITEC Version 1.06B.

Number of Diskettes:

SORITEC - 5 (1.7 Megabytes)


INDEX

A
ACCESS ..... 37
Actual versus fitted ..... 64
Alpha Looping ..... 43
AMORT ..... 53
ANOVA table ..... 64
Arithmetic Mean ..... 50
Arithmetic Sum ..... 50
Autocorrelation techniques ..... 58

B
Batch Processing ..... 10
Beta coefficients ..... 64

C
Cochrane-Orcutt ..... 58
COMPUTE ..... 14,16
Compute Moving Average ..... 49
Compute Moving Sum ..... 49
Conditional branching ..... 43
CONSTANT ..... 13
Constants ..... 13
CONTENTS ..... 40
CONTINUE ..... 43
CONVERT ..... 46,47
Converting time-series from one periodicity to another ..... 45,46
COPY ..... 38
CORC ..... 58
CORREL ..... 49
Correlation matrix ..... 49,64
Correlation Matrix Calculation ..... 49
COVA ..... 49
Covariance matrix ..... 49
Covariance Matrix Calculation ..... 49
CREATE ..... 37
Cross-sectional data ..... 55
Crosstabulation Analysis ..... 56

D
Data Interchange Format (DIF) Files ..... 28
Data types ..... 15
Databanks ..... 37
DIF File Input ..... 28
DIF File Output ..... 30
DISCARD ..... 40
Distribution of the residuals ..... 63
DO ..... 41
DOT ..... 43,44
DUMMY ..... 45
Dummy variables ..... 45

E
Elasticities ..... 64
END ..... 10,41
ENDDOT ..... 44
EQUATION ..... 14
Equations ..... 14
EXECUTE ..... 11
Executing SAC Files ..... 10
EXOGENOUS ..... 59
Exporting data ..... 26

F
FILL ..... 19,33
Financial functions ..... 51
Fitted equation ..... 59
FLAGS ..... 22
FORECAST ..... 60
Forecasting single equation models ..... 59
FOREQ ..... 60
FORGET ..... 24
FORMAT ..... 31
Formatted input and output ..... 31
FORTRAN formatted input ..... 31
FORTRAN formatted output ..... 32

G
Global options ..... 22
GO TO (GOTO) ..... 42
Graphical Display ..... 34
Group expansion ..... 14
GROUP ..... 14

H
HELLO ..... 9
Hildreth-Lu ..... 58
HILU ..... 58

I
IF/THEN/ELSE ..... 43
Illegal transformations ..... 17
Imputation of Missing Values ..... 21
Importing data ..... 26
IMPUTE ..... 21
Input Journal Files ..... 11
Interactive mode ..... 62
Interactive Processing ..... 9
Interactive regression display ..... 62
Invoking SORITEC Sampler ..... 9
Internal rate of return ..... 51
IRR ..... 51

J
JOB ..... 10

K
KEEP ..... 39
Keyboard Entry ..... 33

L
LEGAL ..... 20
Line printer-style graphics ..... 34
Loan amortization ..... 53

M
MA ..... 49
Mathematical functions ..... 16
Matrix ..... 13
MAX ..... 47,48
MAXERR ..... 25
Maximum error limit ..... 25
Maximum Function ..... 47
Maximum value of a series ..... 47
Mean and standard deviation of the independent variables ..... 64
MEAN ..... 50
MIN ..... 48
Minimum Function ..... 48
Minimum value of a data series ..... 48
MISSING ..... 19,20
Missing Data Handling ..... 19
Missing Value Symbol Declaration ..... 20
Missing Value Logical Function ..... 20
MOD ..... 48
Modifiers, in the CONVERT command ..... 47
Modular division ..... 45,48
Moving average ..... 49
Moving sum ..... 49
MSUM ..... 49

N
Namelist ..... 14
Net present value ..... 52
Non-linear estimation ..... 14
Null (Continuation) Statement ..... 43
Numeric Looping ..... 41

O
OFFLIST ..... 25
ON ANOVA ..... 64
ON BETA ..... 64
ON CCOR ..... 64
ON CRT ..... 57
ON GROUP ..... 14
ON PLOT ..... 64
ON REVISE ..... 19
ON STATS ..... 64
ON VCOV ..... 64
ONLIST ..... 25
Options ..... 22
Ordinary least squares ..... 57
ORIGIN ..... 57
Output of Data to the Terminal ..... 34

P
PARAMETER ..... 13
Parameters ..... 13
Partial correlation coefficients ..... 64
Periodic dummy variable ..... 45
PLOT ..... 34
Prefix ..... 44
Present value ..... 52
PRINT ..... 34
PROCEDURE ..... 41
Programming language ..... 41
PUNCH ..... 27
PUNCHDIF ..... 30
PURGE ..... 38
PV ..... 52

Q
QUIT ..... 10

R
READ ..... 27,31,32
READDIF ..... 28
Recode a Variable ..... 46
RECODE ..... 46
RECOVER ..... 22
REGRESS ..... 57
Regression summary table ..... 63
RENAME ..... 39
REPLACE ..... 39
Residual summary table ..... 63
RETURN ..... 38
REVISE ..... 18
Revising Data ..... 18
RMS ..... 50
Root Mean Square ..... 50

S
SAL files ..... 26
SAL File Input ..... 27
SAL File Output ..... 27
SCAN ..... 25
SCATTER ..... 36
Seasonal Dummies ..... 45
Selection menu ..... 62
Serial autocorrelation ..... 58
Series of minimum values ..... 48
Single-equation estimation techniques ..... 57
SORITEC ..... 6
SORITEC DataBank Files ..... 36,37
Special Symbols ..... 12
SSR ..... 50
Standardized residuals ..... 64
Statistical Operations ..... 49,50
Sum of Squared Residuals ..... 50
SUM ..... 50
SWITCH ..... 40
Symbol table ..... 23
SYMBOLS ..... 23
SYNOPSIS ..... 55

T
Tabular Display ..... 34
Time trend dummy series ..... 45
TIME ..... 45
Time-series variables ..... 13
TITLE ..... 25
Transformations ..... 16
Transforming continuous into discrete variables ..... 45
Two-stage least squares (2SLS) ..... 59
TWOSTAGE ..... 59

U
Unconditional Branching ..... 42
Uniform random numbers ..... 48
USE ..... 15
USEIF ..... 15

V
Variable Names ..... 12
Variable Types ..... 13
Variance-covariance matrix ..... 64
Vector ..... 13
VECTOR ..... 13

W
WIDTH ..... 24
Wildcards ..... 21
WRITE ..... 32,33

X
XTAB ..... 56


SORITEC INFORMATION REQUEST FORM

Yes, I'd like to receive more information about Sorites Group's
Econometric software products.

( ) Please send me information about SORITEC Version 1.06B.

( ) Please enter my name on SGI's mailing list to receive
    information about new SORITEC releases.

( ) Send me the SORITEC Reference Manual. Enclosed is
    (U.S.)$25.00 to cover the cost of the manual and shipping.

( ) Send me the latest release of SORITEC Sampler, including a
    bound copy of the SORITEC Sampler Reference Manual and a copy
    of the SORITEC Reference Manual. Enclosed is (U.S.)$50.00 to
    cover the cost of materials and shipping.

Please print or type your name and address in the space below:

Name: _____________________________________
Affiliation: ______________________________
Address: __________________________________
         __________________________________
City: ________________ State: _______________
Country: ____________ Postal Code: _________

Organizational affiliation:

( ) Commercial    ( ) Government
( ) Academic      ( ) Other ______________________

What type of computer do you own or use? ______________________

How many computers are at your address? _______

Complete and Mail to:

The Sorites Group, Inc.
P.O. Box 2939
Springfield, VA 22152